Some Applications of Statistical Methods to the
Analysis of Physical and Engineering Data
By W. A. SHEWHART
Synopsis : Whenever we measure any physical quantity we cus-
tomarily obtain as many different values as there are observations.
From a consideration of these measurements we must determine the
most probable value; we must find out how much an observation may
be expected to vary from this most probable value; and we must learn
as much as possible of the reasons why it varies in the particular way
that it does. In other words, the real value of physical measurements
lies in the fact that from them it is possible to determine something of
the nature of the results to be expected if the series of observations
is repeated. The best use can be made of the data if we can find from
them the most probable frequency of occurrence of any observed
magnitude of the physical quantity or, in other words, the most prob-
able law of distribution.
It is customary practice in connection with physical and engineering
measurements to assume that the arithmetic mean of the observations
is the most probable value and that the frequency of occurrence of
deviations from this mean is in accord with the Gaussian or normal
law of error which lies at the foundation of the theory of errors. In
most of those cases where the observed distributions of deviations have
been compared with the theoretical ones based on the assumption of this
law, it has been found highly improbable that the groups of observa-
tions could have arisen from systems of causes consistent with the
normal law. Furthermore, even upon an a priori basis the normal law
is a very limited case of a more generalized one.
Therefore, in order to find the probability of the occurrence of a
deviation of a given magnitude, it is necessary in most instances to find
the theoretical distribution which is more probable than that given by
the normal law. The present paper deals with the application of ele-
mentary statistical methods for finding this best frequency distribution
of the deviations. In other words, the present paper points out some
of the limitations of the theory of errors, based upon the normal law,
in the analysis of physical and engineering data; it suggests methods
for overcoming these difficulties by basing the analysis upon a more
generalized law of error; it reviews the methods for finding the best
theoretical distribution and closes with a discussion of the magnitude
of the advantages to be gained by either the physicist or the engineer
from an application of the methods reviewed herein.
Introduction
WE ordinarily think of the physical and engineering sciences as
being exact. In a majority of physical measurements this is
practically true. It is possible to control the causes of variation so
that the resultant deviations of the observations from their arithmetic
mean are small in comparison therewith. In the theory of measure-
ments we often refer to the "true value" of a physical quantity: ob-
served deviations are considered to be produced by errors existing in
the method of making the measurements.
44 BELL SYSTEM TECHNICAL JOURNAL
With the introduction of the molecular theory and the theory
of quanta, it has been necessary to modify some of our older con-
ceptions. Thus, more and more we are led to consider the problem
of measuring any physical quantity as that of establishing its most
probable value. We are led to conceive of the physico-chemical
laws as a statistical determinism to which "the law of great numbers" 1
imparts the appearance of infinite precision. In order to
obtain a more comprehensive understanding of the laws of nature
it is becoming more necessary to consider not only the average value
but also the variations of the separate observations therefrom. As
a result, the application of the theory of probabilities is receiving
renewed impetus in the fields of physics and physical chemistry.
Statistical Nature of Certain Physical Problems. As typical of the
newer type of physical problem, we may refer to certain data given by
Prof. Rutherford and H. Geiger. 2 In this experiment the number of
alpha particles striking, within a given interval, a screen subtending a
fixed solid angle was counted. Two thousand six hundred and eight
observations of this number were made. The first column of Table I
records the number of alpha particles striking this screen within a
given interval. The second column gives the frequency of occurrence
corresponding to the different numbers in the first column.
TABLE I
No. of Alpha    Observed Frequency
Particles       of Occurrence
 0                57
 1               203
 2               383
 3               525
 4               532
 5               408
 6               273
 7               139
 8                45
 9                27
10                10
11                 4
12                 0
13                 1
14                 1
It is obviously impossible from the nature of the experiment to at-
tribute the variations in the observed numbers to errors of observa-
tion. Instead, the variations are inherent in the statistical nature
of the phenomenon under observation.
1 Each class of event eventually occurs in an apparently definite proportion
of cases. The constancy of this proportion increases as the number of cases
increases.
2 Philosophical Magazine, October, 1910.
APPLICATION OF STATISTICAL METHODS 45
The questions which must be answered from a consideration of these
data are typical. For example, we are interested to know how a
second series of observations may be expected to differ if the same
experiment were repeated. The largest observed frequency corre-
sponds to four alpha particles, although what assurance is there that
this is the most probable number? What is the probability that any
given number of alpha particles will strike the screen in the same
interval of time? Or again, what is the maximum number of alpha
particles that may be expected to strike the screen? All of these
questions naturally can be answered providing we can determine
the most probable frequency distribution.
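As a rough illustration of how such questions might be approached for the data of Table I, the sketch below computes the mean number of particles per interval and compares the observed frequencies with those of a Poisson law having the same mean. The choice of the Poisson law is an assumption made here for illustration only; the text at this point does not name the form of the law. The counts are transcribed from Table I, with the frequencies for 0 and 12 particles taken as 57 and 0 so that the total agrees with the 2,608 observations stated in the text. The function name is hypothetical.

```python
import math

# Observed frequencies from Table I: number of alpha particles -> frequency.
observed = {0: 57, 1: 203, 2: 383, 3: 525, 4: 532, 5: 408, 6: 273, 7: 139,
            8: 45, 9: 27, 10: 10, 11: 4, 12: 0, 13: 1, 14: 1}

total = sum(observed.values())
mean = sum(k * f for k, f in observed.items()) / total

def poisson_frequency(k, mean, total):
    """Expected frequency of k particles under an assumed Poisson law."""
    return total * math.exp(-mean) * mean ** k / math.factorial(k)

# Theoretical frequencies for the same classes as the observations.
expected = {k: poisson_frequency(k, mean, total) for k in observed}

# Most probable number of particles under the assumed law.
most_probable = max(expected, key=expected.get)

for k in observed:
    print(f"{k:2d}  observed {observed[k]:4d}  expected {expected[k]:7.1f}")
print("mean =", round(mean, 3), " most probable number =", most_probable)
```

Under this assumed law the most probable number need not coincide with the class having the largest observed frequency, which is the point of the question raised in the text.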
Statistical Nature of Certain Telephone Problems. The character-
istics of some telephone equipment cannot be controlled within
narrow limits much better than the distribution of alpha particles
could be controlled in the above experiment. We shall confine our
attention primarily to a single piece of equipment, the carbon microphone.
For many reasons it is necessary to attain a picture of the
way in which a microphone operates. It is necessary to find out
why carbon is the best known microphonic material. In order to
do this we must measure certain physical and chemical characteristics
of the carbon and compare these with its microphonic properties
when used under commercial conditions. In the second place it
becomes necessary to establish methods for inspecting manufactured
product in order to take account of any inherent variability, and
yet not to overlook any evidence of a "trend" in the process of manu-
facture toward the production of a poor quality of apparatus. In
the third place it so happens that the commercial measure of the
degree of control exhibited in the manufacture of the apparatus
must be interpreted ultimately in terms of sensation measures given
by the human ear. That is, the first phase of the problem is purely
physical; the second is one of manufacturing control and inspection
and the third involves the study of a variable quantity by means
of a method of measurement which in itself introduces large variations
in the observations.
In one of the most widely used types of microphones there are
approximately 50,000 granules of carbon per instrument. Each of
these granules is irregular in contour, porous and of approximately
the size of the head of a pin. If such a group of granules is placed
in a cylindrical lavite chamber about 5-inch in diameter and closed
at either end with gold-plated electrodes; if this chamber is then
placed on a suspension free from all building vibrations and carefully
insulated from sound disturbances; if automatically controlled
mechanical means are provided for rolling this chamber at any desired
speed; if all of the air and sorbed gases are removed from the carbon
chamber and pure nitrogen is substituted; if the mean temperature is
kept constant within 2° C; and if means are provided for measuring the
resistance of the granules when at rest by observing the voltage
across the two electrodes while current is allowed to flow for a period
less than 1/200 of a second, it is found that the resistance (for most
samples of carbon) may be determined within a fraction of one per cent.
Fig. 1
If, however, the button is rotated (even as slowly as possible)
and then brought to rest, the resistance may differ several ohms from
its first value. If a large number of observations are made after
this fashion, we may expect to find for certain samples of carbon a
set of values such as given in Fig. 1. The 270 observations of re-
sistance reproduced in this figure were made on a sample of carbon
at 1½ volts under conditions quite similar to those outlined above.
The observed variation is from approximately 215 to 270 ohms.
The upper curve is that of the resistance vs. the serial number of
the readings. There is no apparent trend in the change of resistance
from one reading to another. The lower curve in this figure shows
the frequency histogram of the results. Attention is directed to the
wide variation in the observations, and to the fact that the frequency
histogram appears to be bimodal. 3 Methods of dealing with such
distributions will be considered.
Samples of carbon having different molecular surface structures
have different resistances. To put it in a still more practical way,
if the manufacturing process is not controlled within very narrow
limits, wide variations are produced in the molecular properties of
the carbon. The microphonic properties of these carbons are there-
fore different. One of the problems with which we have been con-
cerned is to determine the relationship existing between the physical
and chemical characteristics of the carbon and the resistance of the
material when measured under different conditions. We are obviously
dealing in this case with problems involving the measurement
of physical quantities which cannot be controlled even in the laboratory.
Fig. 2. Instantaneous resistance vs. voltage (volts).
3 If curves which touch the axis at +∞ and −∞ have more than one value
of the variable for which the derivative of the frequency in respect to the
variable is equal to zero (the points being other than that for which the
frequency is zero), these curves are referred to as bimodal, trimodal, etc. The
modal value is the most probable one and is of particular interest in
unimodal curves.
If we remove the air and measure the resistance at different
voltages, we may expect to find changes in the resistance similar
to those indicated in Fig. 2. Curves 1 and 2 were taken for increasing
voltages. The return curves were taken with decreasing voltage.
Removal of the air from this particular sample of carbon produces
comparatively large changes in the resistance. The resistance at 1½
volts is several times that at 48 volts. These curves were taken
under conditions wherein all of the other factors were controlled. A
sufficient number of observations was made in each case in order to
establish the probable errors of the points as indicated by the radii of
the circles. If this same experiment were carried on at a different
temperature, radically different results would be obtained.
Fig. 3. Possible Types of Breathing of Granular Carbon Microphone.
If, instead of allowing the current to flow for a short interval of
time, a continuous record is made of the resistance of the carbon
while practically constant current flows through the carbon, the
resistance will be found to vary. The maximum resistance reached
in certain instances may amount to several times the minimum value.
In general, this phenomenon is attributed to the effects of gas sorbed
on the surface of the material. Transmitters cannot be made of
lavite so that the expansions and contractions of the piece parts
thereof augment the changes in resistance. This phenomenon,
termed "breathing," may be, but seldom is, regular or periodic. An
exceptional case of breathing is shown in Fig. 3. This was obtained
with a special type of carbon in a commercial structure. The curves
themselves represent the current through the transmitter and, there-
fore, are inversely proportional to the resistance. All five curves
were obtained with the same carbon in the same chamber by varying
merely the configuration of the granules by slightly tapping the carbon
chamber.
All of these effects can be modified to a large extent by varying
the process of manufacture of the granular material. In practice it
is necessary to know why slight changes in the manufacturing process
cause large variations in the resistance characteristics of the carbon.
The same process that improves one microphonic property may prove
a detriment to another. It is in the solution of some of these problems
that statistical methods have been found to be of great value in the
interpretation of the results.
Whereas the physicist ordinarily works in the laboratory under
controlled conditions, the engineer must work under commercial
conditions where it is often impractical to secure the same degree of
control. More than 1,500,000 transmitters are manufactured every
year by the Bell System. Causes of variation other than those intro-
duced by the carbon help to control the transmitter. For example,
variations may be introduced by the process of assembly, or by
differences in the piece parts of the assembled instrument. The
measure of the faithfulness and efficiency of reproduction depends
fundamentally upon the human ear. Obviously all transmitters
cannot be tested. Instead, we must choose a number of instruments
and from observations made on these determine whether or not there
is any trend in the manufactured product. Naturally we may expect
to find certain variations in the results according to the rules of chance.
To take the simplest illustration, we may flip a coin 6 times. Even
if it is symmetrical we may expect occasionally to find all heads and
occasionally all tails, although the most probable combination is
that of 3 heads and 3 tails. We must, therefore, determine first of all
whether or not the observed variations are consistent with those due
to sampling according to the laws of chance. If there is an apparent
trend in product, the data should be analyzed in order to determine,
if possible, whether it is due to lack of control in the manufacture
of carbon or to some other set of causes such as mentioned above.
Because of economic reasons we must keep the number of observations
at a minimum consistent with a satisfactory control of the product.
Here again it has been found that the application of statistical methods
is necessary to the solution of the problems involved.
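The coin illustration above can be sketched numerically. The snippet below lists the probability of each number of heads in 6 flips of a symmetrical coin, using the binomial law; the function name and the default probability of one half are assumptions introduced here for illustration.

```python
from math import comb

def heads_probabilities(flips, p_head=0.5):
    """Probability of 0, 1, ..., flips heads for independent flips of a coin."""
    q_tail = 1.0 - p_head
    return [comb(flips, k) * p_head ** k * q_tail ** (flips - k)
            for k in range(flips + 1)]

probabilities = heads_probabilities(6)

# Index of the largest probability is the most probable number of heads.
most_probable_heads = max(range(len(probabilities)), key=probabilities.__getitem__)

for heads, p in enumerate(probabilities):
    print(f"{heads} heads: {p:.6f}")
print("most probable number of heads:", most_probable_heads)
```

All heads and all tails each occur with probability 1/64, so they are to be expected occasionally even though 3 heads and 3 tails is the most probable combination.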
Before considering the problem of the measurement of efficiency
and quality of the transmitter, let us consider the schematic diagram
of the telephone system as shown in Fig. 4. Essentially this consists
of the transmitter, the line and the receiver. The oldest method
of measurement is to compare one transmitter against a standard
in the following way. An observer calls first in the standard and
then in the test transmitter, while another observer at the receiving
end judges the faithfulness of reproduction.
Fig. 4. Schematic diagram of the telephone system (transmitter, line, receiver).
The pressure wave
striking the transmitter diaphragm varies with the observer and also
with the degree of mechanical coupling between the sound source
and the diaphragm of the instrument. The judgment of the observer
at the receiving end is influenced by physiological and psychological
causes. Obviously it is desirable that such a method be supplanted
by a machine test which will eliminate the variabilities in the sound
source and in the human ear. Up to the present time the nature of
speech and the characteristics of the human ear are not known suffi-
ciently well to establish either an ideal sound source or an electrical
meter to replace the human voice and ear respectively. The best that
can be done is to approximate this condition. Even though the
meter readings may be the same, the simultaneous observations
made with the ear in general will be different. A calibration of the
machine must, therefore, depend upon a study of the degree of correla-
tion between the average measure given by the machine and that
given by the older method of test.
Thus, we see how special problems arise in the fields of both physics
and engineering wherein it is impossible to control the variations.
In what way, if any, are these problems related, or is it necessary to
attack each one in a different manner? We shall see that all of
these problems are in a way fundamentally the same and that the
same method of solution can be applied to all of them. This is true
because it is necessary to determine in every instance the law of
distribution of the variable about some mean value.
Why Do We Need to Know the Law of Deviation of the
Different Observations About Some Mean Value?
In all of the above problems as in every physical and engineering
one, certain typical questions arise which can be answered only if we
know the law of distribution y = f(x) of the observations where y
represents the frequency of occurrence of the deviations x from some
mean value. At least three of these questions are the same for both
fields of investigation. 4
Let us consider the physical problem. From a group of n observa-
tions of the magnitude of a physical quantity, we obtain in general n
distinct values which can be represented by $X_1, X_2, \ldots, X_n$. From
a study of these we must answer the following questions :
1. What is the most probable value?
2. What is the frequency of occurrence of values within any two
limits?
3. Is the set of observations consistent with the assumption of a
random system of causes?
The answers to these questions are necessary for the interpretation
of Prof. Rutherford's data referred to above; they are required in
order to interpret the data presented in Fig. 1 which are typical of
physical and chemical problems arising in carbon study; these same
answers are fundamentally required in the analysis of all physical data.
These questions can be answered from a study of the frequency dis-
tribution. If this be true, it is obvious that the statistical methods
of finding the best distribution are of interest to the physicist.
Let us next consider the engineering problem where we shall see
that the same questions recur. Assuming that manufacturing
methods are established to produce a definite number of instruments
within a fixed period, one or more of the characteristics of these
instruments must be controlled. We may represent any one of
these characteristics by the symbol X. The total number of instru-
ments that will be manufactured is usually very indefinite. It is,
however, always finite. Even with extreme care some variations
in the methods of manufacture may be expected which will produce
4 In order to calibrate the machine referred to in a preceding paragraph
and also to determine the relationships between the physico-chemical and micro-
phonic properties of carbon, it was necessary to study the correlation between
two or more variables, but in each case it was necessary to determine first the
law of distribution for each variable in order to interpret the physical signifi-
cance of the measures of correlation because this depends upon the laws of
distribution. The reason for this is not discussed in the present paper, for
attention is here confined to the method of establishing the best theoretical
frequency distribution derived from a study of the observations.
variations from instrument to instrument in the quantity X. After
the manufacturing methods have been established, the first problem
is to obtain answers to the following questions:
1. What is the most probable value of X?
2. What is the percentage of instruments having values of X
between any two limits?
3. Are the causes controlling the product random, or are they
correlated? 5
In this practical case we must decide to choose a certain number
of instruments in order to obtain the answers to these questions;
that is, to obtain the most probable frequency distribution. We
must, however, go one step further. We must choose a certain
number of instruments at stated periods in order to determine whether
or not the product is changing. How big a sample shall we choose
in the first place, and how large shall the periodic samples be? Obvi-
ously it is of great economic importance to keep the sample number
in any case at a minimum required to establish within the required
degree of precision the answers to the questions raised.
The close similarity between the physical and engineering problems
must be obvious. Naturally, then, we need not confine ourselves
in the present discussion to a consideration of only the problems
arising in connection with the study of those microphonic properties
of carbon which gave rise to the present investigation. Several
examples are therefore chosen from fields other than carbon study.
However, only those points which have been found of practical advantage
in connection with the analysis of more than 500,000 observations
will be considered.
The type of inspection problem may be illustrated by the data
given in Table II.
The symbol X refers to the efficiency of transmitters as determined
in the process of inspection; N represents the number of instruments
measured in order to obtain the average value X̄. The first four
rows of data represent the results obtained by four inspection groups
$G_1$, $G_2$, $G_3$ and $G_4$. The results given are for the same period of time.
The next three rows are those for different machines $M_1$, $M_2$ and $M_3$.
The last row gives the results of single tests on 68,502 transmitters, a
part of which was measured on each of the three machines. The third
column in the table gives the standard deviations. It will be observed
5 The significance of this question will become more evident in the course of
the paper. We shall find that, if the causes are such as to be technically termed
random, we can answer all practical questions with a far greater degree of
precision than we can if the causes are not random.
TABLE II
Inspection Data on Transmitters

Group             X̄       σ     3σ/√N      N       k     σ_k     β_2    σ_β2    σ_X̄    3σ_k   3σ_β2   Pearson Type
G_1              .548    .739   .0131    4510   -.214   .056   4.152    .073    .011    .108    .219    IV
G_2              .740    .896   .0533    2540   -.949   .049   4.426    .097    .018    .147    .291    VI
G_3              .766    .762   .0568    1620   -.109   .061   5.176    .122    .019    .183    .366    VI
G_4              .934    .677   .0398    2610  -1.413   .048   7.677    .096    .013    .144    .288    IV
M_1            -1.66    1.32    .0386   10855   -.70    .024   3.128    .047    .013    .072    .141
M_2            -1.69    1.07    .0300   11577   -.84    .023   4.240    .046    .010    .069    .138
M_3            -1.79    1.04    .0510    3749   -.56    .040   3.628    .080    .017    .120    .240
Machines 1, 2, 3  -1.641  1.14  .0131   68502   -.80    .009   [illegible]      .004    .027            I
that comparatively large differences exist between the averages
obtained for different groups of transmitters by different groups of
observers. Similarly, comparatively large variations exist in these
averages even when taken by the machines (the large difference
between the sensation and machine measures is due to a difference
in the standard used, corrections for which are not made in this table).
Are these differences significant? Is product changing? That is,
are the manufacturing methods being adequately controlled? Are
these results consistent with a random variation in the causes con-
trolling manufacture? These are the questions that were raised in
connection with the interpretation of these data. The ordinary
theory of errors gives us the following answer. It will be recalled
that the standard deviation (or the root mean square deviation) of the
average $\sigma_{\bar{X}}$ is equal to $\sigma/\sqrt{N}$. Also, from the table of the normal
probability integral we find that the fractional parts of the area
within certain ranges are as follows: For the ranges $\bar{X} \pm \sigma$, $\bar{X} \pm 2\sigma$,
and $\bar{X} \pm 3\sigma$, we have the percentages 68.268, 95.450, and 99.730
respectively. Obviously, it is highly improbable that the difference
between averages should be greater than three times the standard
deviation of the average, providing we assume that all of the samples
were drawn from the same universe; in other words, that all of the
samples were manufactured under the same random conditions.
The fourth column, then, indicates practical limits to the variations
in the averages. It is obvious, therefore, that the differences between
the averages are larger than could have been expected, if the same
system of causes controlled the different groups of observations. In
other words the differences are significant and must be explained.
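The reasoning of the preceding paragraph can be sketched as a short calculation. The snippet below computes three times the standard deviation of the average for groups G2, G3 and G4, using the averages, standard deviations and sample numbers as read from Table II, and checks whether two averages differ by more than that limit. Comparing the difference with the larger of the two limits is a simplification assumed here, not a procedure stated in the text, and the function names are hypothetical.

```python
import math

# (average, standard deviation, number of instruments) as read from Table II.
groups = {
    "G2": (0.740, 0.896, 2540),
    "G3": (0.766, 0.762, 1620),
    "G4": (0.934, 0.677, 2610),
}

def three_sigma_limit(sigma, n):
    """Three times the standard deviation of the average, 3*sigma/sqrt(n)."""
    return 3.0 * sigma / math.sqrt(n)

def exceeds_limit(name_a, name_b):
    """True when two averages differ by more than the larger of the two limits."""
    mean_a, sigma_a, n_a = groups[name_a]
    mean_b, sigma_b, n_b = groups[name_b]
    limit = max(three_sigma_limit(sigma_a, n_a), three_sigma_limit(sigma_b, n_b))
    return abs(mean_a - mean_b) > limit

for name, (mean, sigma, n) in groups.items():
    print(name, "average", mean, "limit", round(three_sigma_limit(sigma, n), 4))
print("G2 vs G4 exceeds limit:", exceeds_limit("G2", "G4"))
```

A fuller treatment would combine the two standard deviations of the averages before comparing; the sketch keeps to the simpler rule described in the text.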
Why do these variations exist? We shall show in the course of
the discussion that the normal law is not sufficient to answer these
questions. We shall show also that the variations noted are largely
the result of the method of sampling used at that time. The sig-
nificance of the other factors given in this table is discussed later.
Why Is the Application of the Normal Law Limited?
Why can we not assume that the deviations follow the normal law
of error
$$y = \frac{n}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \qquad (1)$$
where $\sigma$ is the root mean square error $\sqrt{\Sigma y x^2 / n}$, $y$ is the frequency
of occurrence of the deviation $x$ from the arithmetic mean and $n$ is
the number of observations? If they do, all of the
questions raised in the preceding paragraphs can be easily answered
in a way which is familiar to all acquainted with the ordinary theory
of errors and the method of least squares. This is an old and much
debated question in the realm of statistics. Let us review briefly
some of the a posteriori and a priori reasons why the normal law has
gained such favor and yet why it is one of the most limited, instead
of the most general, of the possible laws.
A Posteriori Reasons. The original method of explaining the
normal law rests upon the assumption that the arithmetic mean
value of the observations is always the most probable. Since expe-
rience shows that the observed arithmetic mean seldom satisfies the
condition of being the most probable we may justly question the
law based upon an apparently unjustified assumption.
Gauss first enunciated this law which is often called by his name.
The fact that so great a mathematician proposed it led many to
accept it. He assumes that the frequency of occurrence of a given
error is a function of the error. The probability that a given set of
n observations will occur is the product of the probabilities of the n
independent events. He then assumes that the arithmetic mean is
the most probable and finds the equation of the normal law. Thus
he assumes the answer to the first question; that is, he assumes that
the most probable value is always the arithmetic mean. In most
physical and engineering measurements the deviations from the
arithmetic mean are small, and the number of observations is not
sufficiently large to determine whether or not they are consistent
with the assumption of the normal law. Under these conditions this
law is perhaps as good an approximation as any.
The fundamental assumptions underlying the original explanation
were later brought into question. What a priori reason is there for
assuming that the arithmetic mean is the most probable value? Why
not choose some other mean? 6 Thus if we assume that the median 7
value is the most probable, we obtain as a special case the law of
error represented by the following equation :
$$y = A e^{-h^2 |x|} \qquad (2)$$
where y represents the frequency of occurrence of the deviation x from
the median value and e is the Naperian base of logarithms. Both
A and h are constants. If, however, we assume that the geometric
mean is the most probable, we have as a special case the law of error
represented by the following equation :
$$y = A e^{-h^2 (\log X - \log a)^2} \qquad (3)$$
where in this case y is the frequency of occurrence of an observation
of magnitude X, "a" is the true value, and A and h are constants. 8
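The connection between the choice of mean and the form of the law can be checked numerically in the reverse direction. The sketch below takes a small hypothetical set of positive observations and searches for the value that maximizes the product of the frequencies under each of the three laws (the normal law, the median law and the geometric-mean law). Positive constants such as A and h are dropped because they do not change the location of the maximum. The sample values, function names and grid step are assumptions introduced for illustration.

```python
import math

# Hypothetical positive observations; not taken from the paper.
sample = [1.0, 2.0, 3.0, 10.0, 20.0]

def log_product_normal_law(a, xs):
    """Log of the product of frequencies under the normal law, up to constants."""
    return -sum((x - a) ** 2 for x in xs)

def log_product_median_law(a, xs):
    """Same quantity under the law y = A exp(-h^2 |x|) with deviations x - a."""
    return -sum(abs(x - a) for x in xs)

def log_product_geometric_law(a, xs):
    """Same quantity under the law y = A exp(-h^2 (log X - log a)^2)."""
    return -sum((math.log(x) - math.log(a)) ** 2 for x in xs)

def most_probable_value(log_product, xs, step=0.001):
    """Grid search between the least and greatest observation."""
    low, high = min(xs), max(xs)
    count = int(round((high - low) / step))
    candidates = [low + i * step for i in range(count + 1)]
    return max(candidates, key=lambda a: log_product(a, xs))

arithmetic_mean = sum(sample) / len(sample)
median = sorted(sample)[len(sample) // 2]
geometric_mean = math.exp(sum(math.log(x) for x in sample) / len(sample))

print("normal law   :", most_probable_value(log_product_normal_law, sample), arithmetic_mean)
print("median law   :", most_probable_value(log_product_median_law, sample), median)
print("geometric law:", most_probable_value(log_product_geometric_law, sample), geometric_mean)
```

For each law the value found by the search should agree, to within the grid step, with the corresponding mean of the sample.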
Enough has been said to indicate the significance of the assumption
that the arithmetic mean is the most probable value. But why choose
this instead of some other mean? No satisfactory answer is available.
So far as the author has been able to discover, no distribution representing
physical data has ever been found which approaches the median
law. Several examples have been found in the study of carbon
which conform to the law of error derived upon the assumption that
the geometric mean is the most probable. If the arithmetic mean
were observed to be the most probable in a majority of cases, we
might consider this an a posteriori reason for accepting the normal
law. We find the contrary to be the case.
Furthermore, we find in general that the distribution of errors is
non-symmetrical about the mean value. In fact, most of the distri-
butions which are given in textbooks dealing with the theory of
errors and the method of least squares to illustrate the universality
6 An average or mean value may be defined as a quantity derived from a
given set of observations by a process such that if the observations became all
equal, the average will coincide with the observations, and if the observations
are not all equal, the average is greater than the least and less than the greatest.
7 If a series of n observations are arranged in ascending order of magnitude,
the median value is that corresponding to the observation occurring midway
between the two ends of the series.
8 A very interesting discussion of the various laws that may be obtained by
assuming different mean values is given in J. M. Keynes' "A Treatise on the
Theory of Probability."
of the law are, themselves, inconsistent with the assumption of such
a law. Prof. Pearson was one of the first to point out this fact. He
considers among others an example originally given by Merriman 9
in which the observed distribution is that of 1,000 shots fired at a
target.
Fig. 5. Distribution of 1,000 Shots.
The theoretical normal is the solid line in Fig. 5 and the
observed frequencies are the small circles. When represented in this
way there appears to be a wide divergence between theory and ex-
perience. Of course, some divergence may always be expected as
a result of variations due to sampling; and, too, we must always
question a judgment based entirely upon visual observation 10 of a
graphical representation of this character. Prof. Pearson uses his
method — which will be discussed later — for measuring the goodness
of fit between the theoretical and observed distributions. He 11
finds that a fit as bad or worse than that observed could have been
expected to occur on an average of only 15 to 16 times in ten million.
We must conclude, therefore, that these data are not consistent
with the assumption of a universal normal law.
A Priori Reasons. From the physicist's viewpoint the origin of
the Gaussian law may be explained upon a more satisfactory basis.
9 "Method of Least Squares," Eighth Edition, Page 14.
10 This point will be emphasized later: first, by showing that these data
appear consistent with a normal law when plotted on probability paper, and
second, by showing that some frequency distributions appear normal when
plotted even though they are not. The other data in this table will be re-
ferred to later.
11 Reference to the original article and a quotation therefrom given in the
eleventh edition of the Encyclopedia Britannica on the article "Probability."
APPLICATION OF STATISTICAL METHODS 57
It is that which was originally suggested by La Place. If, however,
we accept this explanation, we must accept the fact that the normal
law is the exception and not the rule. Let us consider why this is
true. 12
This method of explanation rests upon the assumption that the
normal law is the first approximation to the frequencies with which
different values will be assumed by a variable quantity whose varia-
tions are controlled by a large number of independent causes acting
in random fashion. Let us assume that :
a. The resultant variation is produced by n causes.
b. The probability p that a single cause will produce an effect Δx
is the same for all of the causes.
c. The effect Δx is the same for all of the causes.
d. The causes operate independently one of the other.
Under these assumptions the frequency distribution of deviations of
0, 1, 2 ... n positive increments can be represented by the successive
terms of the point binomial N(q+p)^n where N represents the total
number of observations.
Under these conditions if p = q and n = ∞, the ordinates of the
binomial expansion can be closely approximated by a normal curve
having the same standard deviation. These restrictions are indeed
narrow. In practice it is probable that p is never equal to q, and it
is certain that n is never infinite. Therefore, the normal distribution
should be the exception and not the rule.
There is a more fundamental reason, however, why we should
seldom expect to find an observed distribution which is consistent
with the normal law. In what has preceded we have assumed that
each cause produced the same effect Δx, and that the total effect in
any instance is proportional to the number of successes.
Let us assume that the resultant effect is, in general, a function of
the number n of causes producing positive effects, that is, let X = φ(n).
Thus we assume that the frequency distributions of the number of
causes and of the occurrence of a magnitude X are respectively
$$ y = f(n) $$
and
$$ y_1 = f_1(X). $$
For two values of n, say n and n+dn, there will be two values of X,
say X and X+dX. The number of observations within this interval
of n must be the same as that within the corresponding interval of X.
12 Bowley, "Elements of Statistics," Part II.
If the distribution in X is normal such that we have
$$ y_1 = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-a)^2}{2\sigma^2}} $$
then
$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, \phi'(n)\, e^{-\frac{[\phi(n)-a]^2}{2\sigma^2}} \tag{4} $$
where a is the arithmetic mean value. Therefore, the distribution
of the causes need not be normal; conversely if the causes are dis-
tributed normally, the observations will not in general be normal.
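The step leading to equation (4) can be written out as follows. This is only a sketch, and it assumes that φ is differentiable and increasing, so that each interval dn corresponds to exactly one interval dX:

```latex
% Equal numbers of observations in corresponding intervals of n and X:
y\,dn = y_1\,dX, \qquad X = \phi(n) \;\Longrightarrow\; dX = \phi'(n)\,dn .
% Hence the frequency expressed in terms of the number of causes is
y = y_1\,\frac{dX}{dn}
  = \frac{1}{\sigma\sqrt{2\pi}}\,\phi'(n)\,
    e^{-\frac{[\phi(n)-a]^2}{2\sigma^2}} .
```

The factor φ'(n) is what prevents y from being normal in n unless φ is linear.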
This idea is of great importance in the interpretation of observed
distributions of physical data. 13 To illustrate, let us assume that the
natural causes which affect the growth of apples on a given tree
produce a normal variation in the diameters of the apples. Obviously,
the distribution of either the cross-sectional areas or the volumes
will not be normal. 14 If the distribution of the diameters is normal
as supposed, the arithmetic mean of these diameters is the most
probable value. Obviously, however, neither the arithmetic mean
area nor the arithmetic mean volume will be the most probable,
because in general
$$ \frac{\sum f(X)}{N} \neq f\!\left(\frac{\sum X}{N}\right). \tag{5} $$
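The inequality (5) may be checked numerically for the apple illustration. The sketch below is not taken from the paper: it assumes, purely for illustration, spherical apples with normally distributed diameters, and the values of `MEAN_D`, `SD_D` and `N` are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so that the sketch is repeatable

# Hypothetical diameters in arbitrary units (assumed, not from the paper).
MEAN_D, SD_D, N = 7.0, 1.0, 200_000

diameters = [random.gauss(MEAN_D, SD_D) for _ in range(N)]
# Volume of a sphere from its diameter: f(X) = pi * X^3 / 6.
volumes = [math.pi * d ** 3 / 6.0 for d in diameters]

mean_diameter = sum(diameters) / N
mean_volume = sum(volumes) / N                            # mean of f(X)
volume_of_mean = math.pi * mean_diameter ** 3 / 6.0       # f(mean of X)

print(f"arithmetic mean of the volumes : {mean_volume:.2f}")
print(f"volume of the mean diameter    : {volume_of_mean:.2f}")
```

The first figure exceeds the second, because the cube of the diameter weights the large apples more heavily than the small ones.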
As already indicated, the deviations dealt with in the present investi-
gation were not small. The form of the observed distribution may
be expected, therefore, to depend upon the functional relationship
between the observed quantity and the number of causes. We shall
13 Kapteyn, J. C., Skew Frequency Curves, Groningen, 1903.
14 In the theory of errors this fact is taken into account by assuming that
the variations are always small. Thus, if the variable X can be represented as
a function F of certain other variables U_1, U_2, . . . U_m so that we have
X = F(U_1, U_2, ... U_m),
we ordinarily assume that we can write this expression in the following form
X = F(a_1 + u_1, a_2 + u_2, ... a_m + u_m).
A further assumption is made that the u's are small so that 2nd and higher
powers and products of these can be neglected. Under these conditions the
distribution of X is normal and has a standard deviation given by the following
expression :
$$ \sigma_X = \sqrt{\sum_{i=1}^{m} \left(\frac{\partial F}{\partial U_i}\right)^2 \sigma_{u_i}^2} $$
But, thus, we are led to overlook the significance of the form of F, particularly
in those practical cases such as are of interest in the present paper where the
quantities u_1, u_2, ... u_m are not small.
illustrate the significance of these ideas as an aid in the interpretation
of data by reference to the results of our study of the law of error of
the human ear in measuring the efficiency of transmitters.
Let us consider the problem of determining the minimum audible
sound intensity. Let us assume that there are n physiological and
psychological causes controlling this sensation measure, and that the
probabilities of the causes producing 0, 1, 2 . . . n effects are distributed
normally. Because of these differences in human ears
different amounts of sound energy are required to produce minimum
audible sensations. What is the distribution of energies?
Fig. 6. Frequency distribution of 710 observations made to determine minimum audible sound intensity
The data are given in Fig. 6. These have been previously reported
by Fletcher and Wegel of this laboratory. 15 The method of making
these measurements was described in their original papers. It is
sufficient to recall that the results are given in terms of pressures in
dynes per square centimeter. Seven hundred and ten observations
covering the frequency range of from 60 to 8,000 cycles are included.
The data include results for both ears of 14 women and 20 men, and
one ear only for two women and two men. Only ears that had been
medically inspected as being physiologically normal were selected.
These results, therefore, include variations in the observations of a
single observer with those of different observers.
The natural logarithms of the intensities were added and the
average of these was obtained. The distribution of the natural
15 Fletcher, H. and Wegel, R. L., Proceedings of the National Academy of
Science, Vol. VIII, pp. 5-6, January, 1922.
Physical Review, Vol. XIX, pp. 550 seq. 1922.
logarithms of the intensities is given in the second column of the
table in Fig. 6. The smooth line is the normal curve based upon the
observed value of standard deviation. The distribution of the
logarithms of the intensities is normal. 16 The arithmetic mean of
the logarithms is the most probable. Therefore, the distribution
of intensities is decidedly skew, and the geometric mean intensity is
the most probable. Here, then, is an excellent example in which
it is highly probable that the distribution of the causes is random
and normal, but in which the resultant effect is not a linear function
of the number of causes. 17
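The calculation described above, averaging the natural logarithms and returning to the original scale, may be sketched as follows. The pressures listed are hypothetical values chosen only to show the arithmetic; they are not the Fletcher and Wegel observations.

```python
import math

# Hypothetical threshold pressures in dynes per square centimeter
# (illustrative only, NOT the data of Fig. 6).
pressures = [0.0004, 0.0007, 0.001, 0.001, 0.0015, 0.002, 0.003, 0.006, 0.012]

logs = [math.log(p) for p in pressures]
mean_log = sum(logs) / len(logs)            # arithmetic mean of the logarithms
geometric_mean = math.exp(mean_log)         # back on the original scale
arithmetic_mean = sum(pressures) / len(pressures)

print(f"geometric mean : {geometric_mean:.5f}")
print(f"arithmetic mean: {arithmetic_mean:.5f}")
```

For a skew set of this kind the arithmetic mean is pulled upward by the few large values, whereas the geometric mean corresponds to the center of the distribution of logarithms.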
Can We Ever Expect to Find a Normal Distribution
in Nature?
The answer is affirmative. If the resultant effect of the inde-
pendent causes is proportional to their number, the distribution
rapidly approaches normality as the number of causes is increased
even though p ≠ q.
To show this, let us assume that the variation in a physical quan-
tity is produced by 100 causes, and that each cause produces the
same effect Δx. Also, let us assume the probability p that each
cause produces a positive effect to be 0.1. The distribution of 0, 1, 2, . . . n
successes in 1000 trials is given by the terms of the expansion
1000(.9 + .1)^100. Obviously such a distribution is skew, p is certainly
not equal to q, and n is far from being infinite. If the normal law
16 In fact this is an exceptionally close approximation to the normal law.
This will be more evident after we have considered the methods for measuring
the goodness of fit as indicated by the other calculations given in this figure.
For the present it is sufficient to know that approximately 75 times out of 100
we must expect to get a system of observations which differ as much or more
from the theoretical distribution calculated from the normal law than the ob-
served distribution differs therefrom in this case. The fact that the second
approximation does not fit the observed distribution as well as the normal
(i.e. the measure of probability of fit P is less) indicates that the observed value
of the skewness k is not significant.
17 These results are of particular interest to telephone engineers. The fact
that the distribution of the logarithms of the intensities is normal is consistent
with the assumption of Fechner's law which states that the sensation is pro-
portional to the logarithm of the stimulus. The range of variation (that is,
X ± 3σ) in different observers' estimates of the sound intensity required to
produce the minimum audible sensation is approximately 20 miles. The range
of error of estimate depends upon the intensity of sound and decreases as the
sound energy level increases. Thus for the average level which prevails for
transmission over the present form of telephone system in a three mile loop
common battery circuit it is less than 9 miles. Even at this intensity, however,
it is obvious that although scarcely any observers will differ in their estimates
by more than 9 miles, 50% of them will differ by at least 2 miles. These
results also furnish experimental basis for the statement made in the beginning
of this paper: that is, the variations introduced in the method of measurement
of transmitter efficiencies are large in comparison with the average efficiency.
TABLE III. Terms of the binomial expansion 1000(.9 + .1)^100 (second column) and the corresponding normal distribution (third column); tabulated values illegible in source.
were fitted to such a distribution, would it be possible to detect easily
any great difference between theory and observation?
Let us compare the two distributions. The data are given in
Table III. First, the average value must be the most probable in
order to be consistent with the normal law. It is, because the observed
most probable value corresponds to 10 successes, and the average of
Fig. 7
the hypothetically observed distribution is 9.998. This under ordi-
nary circumstances would be considered a close check between theory
and practice.
The normal distribution is given in the third column of the table.
Even though there is a difference between the frequencies given in
the second and third columns, would the average observer be apt to
conclude that the hypothetically observed distribution is other than
normal? He would probably base his answer upon a graphical
comparison such as given in Fig. 7. The solid line represents the
normal curve; whereas the frequencies given in the second column
of Table III are represented by circles. It is obvious that the normal
law appears to be a very close approximation to the terms of the
binomial expansion.
Thus we see that for even a small number of causes the difference
between p and q may be quite large, and yet the difference between
the distributions given by the binomial expansion and that given by
the normal law is apparently small and not easily to be detected by
ordinary methods. As n increases the closeness of fit does likewise.
If p is equal to q, the number of causes must be very small indeed
before we are able to detect the difference between the terms of the
binomial expansion and those given by the normal law. To show
that this is true I have chosen a case corresponding to a physical
condition where there are only 16 causes and where p is equal to q.
The data are given in Table IV.
TABLE IV

Successes    (.5 + .5)^16    Normal Law with same σ
                  f                   f1
    0         .0000153            .0000669
    1         .0002441            .0004363
    2         .0018311            .0022159
    3         .0085449            .0087641
    4         .0277710            .0269955
    5         .0666504            .0647588
    6         .1221924            .1209853
    7         .1745605            .1760326
    8         .1963806            .1994711
    9         .1745605            .1760326
   10         .1221924            .1209853
   11         .0666504            .0647588
   12         .0277710            .0269955
   13         .0085449            .0087641
   14         .0018311            .0022159
   15         .0002441            .0004363
   16         .0000153            .0000669
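The two comparisons just discussed, 16 causes with p = q and 100 causes with p = .1, can be repeated with a few lines of code. This is a minimal sketch and not the author's computation; the function names are ours, and the normal curve is taken with mean np and standard deviation √(npq).

```python
import math

def binomial_terms(n, p):
    """Successive terms of (q + p)^n for 0, 1, ..., n successes."""
    q = 1.0 - p
    return [math.comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)]

def normal_ordinates(n, p):
    """Ordinates of the normal curve with the same mean and standard deviation."""
    mean = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    c = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return [c * math.exp(-((k - mean) ** 2) / (2.0 * sigma ** 2)) for k in range(n + 1)]

def largest_gap(n, p):
    """Largest absolute difference between the two sets of relative frequencies."""
    return max(abs(b - g) for b, g in zip(binomial_terms(n, p), normal_ordinates(n, p)))

for n, p in [(16, 0.5), (100, 0.1)]:
    print(f"n = {n:3d}, p = {p}: largest gap = {largest_gap(n, p):.5f}")
```

In both cases the largest gap is a small fraction of the largest ordinate, which is why the eye fails to detect it on a plot.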
Obviously, therefore, the limitations imposed by the assumptions
as to the number of causes and the equality of p and q are not as
important as they might at first appear. It is probable that this is
one of the reasons why we find approximately normal distributions.
If, however, p is sufficiently small, the difference between the observed
distribution and that consistent with the normal law can easily be
detected. We shall show in a later section that this is true for
Rutherford's data. 18
Is There a Universal Law of Error ?
Obviously from what has already been said, the normal law is not a
universal law of nature. It is probable that no such law exists. We
do, however, have certain laws which are more general than the
normal. We shall consider briefly some of these types in an effort to
indicate the advantages that can be gained by an application of them
to physical data.
18 Loc. cit.
Binomial Expansion (p+q)^n. We have already seen that the
distribution is approximately normal when p = q and n = ∞. Following
Edgeworth 19, Bowley 20 shows that if p ≠ q but n = ∞ the frequency
y of the occurrence of a deviation of magnitude x is given by the
following expression where k represents the skewness 21 of the
distribution :
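$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \left[ 1 - \frac{k}{2}\left(\frac{x}{\sigma} - \frac{x^3}{3\sigma^3}\right) \right]. \tag{6} $$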
This will be referred to as the second approximation.
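To see what the second approximation gains over the normal law, we may apply both to the case of 100 causes with p = .1. The sketch below assumes the usual Edgeworth form of the skew correction, namely the normal ordinate multiplied by 1 − (k/2)(z − z³/3) with z = x/σ and k = (q − p)/√(pqn); the function names are ours.

```python
import math

def binomial_terms(n, p):
    q = 1.0 - p
    return [math.comb(n, s) * p ** s * q ** (n - s) for s in range(n + 1)]

def approximation(n, p, use_skew):
    """Normal law (use_skew=False) or second approximation (use_skew=True)."""
    q = 1.0 - p
    mean = n * p
    sigma = math.sqrt(n * p * q)
    k = (q - p) / sigma                      # skewness of the point binomial
    out = []
    for s in range(n + 1):
        z = (s - mean) / sigma               # deviation in units of sigma
        normal = math.exp(-z * z / 2.0) / (sigma * math.sqrt(2.0 * math.pi))
        correction = 1.0 - (k / 2.0) * (z - z ** 3 / 3.0) if use_skew else 1.0
        out.append(normal * correction)
    return out

def largest_gap(exact, approx):
    return max(abs(a - b) for a, b in zip(exact, approx))

exact = binomial_terms(100, 0.1)
gap_normal = largest_gap(exact, approximation(100, 0.1, use_skew=False))
gap_second = largest_gap(exact, approximation(100, 0.1, use_skew=True))
print(f"normal law           : {gap_normal:.5f}")
print(f"second approximation : {gap_second:.5f}")
```

Under these assumptions the skew term removes most of the discrepancy left by the normal law.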
If p is very small, but pn = λ is finite, we have the so-called law of
small numbers 22 which was first derived by Poisson. The successive
terms of the series e^(-λ)(1 + λ + λ²/2! + λ³/3! + ...) represent the chances of
0, 1, 2 ... n successes. Theoretically, if we are dealing with a dis-
tribution of attributes, 23 it is always possible to calculate the values of
19 Cambridge Philosophical Transactions, Vol. XX, 1904, pp. 36-65 and 113-141.
20 Loc. cit.
21 In statistical work the practice is followed of using the moments of the
distribution for determining the parameters of the frequency curve. The i-th
moment μ_i of a frequency distribution about the arithmetic mean is by definition
$$ \mu_i = \frac{\sum y\,x^i}{N}. $$
In calculating such moments it is necessary to consider the observations as
grouped about the mid-point of the class interval and unless this interval is
very small certain errors are introduced which can be partially eliminated by
applying Sheppard's corrections as given by him in Biometrika, Vol. III, pages
308 seq. If Δx be taken as unity, we have
$$ \mu_1 = 0, \qquad \mu_2 = pqn = \sigma^2, \qquad \mu_3 = pqn(q-p), \qquad \mu_4 = 3(pqn)^2 + pqn(1-6pq), $$
$$ k = \sqrt{\beta_1} = \frac{q-p}{\sqrt{pqn}}, \qquad \beta_2 - 3 = \frac{1-6pq}{pqn}, $$
and if p is approximately equal to q and n is large we have
$$ \sigma_k = \sqrt{\frac{6}{N}} \quad \text{and} \quad \sigma_{\beta_2} = \sqrt{\frac{24}{N}}. $$
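The relations for the point binomial can be inverted to recover p, q and n from the second and third moments. The sketch below assumes the relations μ2 = pqn and μ3 = pqn(q − p) with Δx taken as unity; the function name is ours.

```python
def binomial_from_moments(mu2, mu3):
    """Recover p, q, n of a point binomial from its 2nd and 3rd central moments."""
    q_minus_p = mu3 / mu2            # since mu3 / mu2 = q - p
    p = (1.0 - q_minus_p) / 2.0      # because p + q = 1
    q = 1.0 - p
    n = mu2 / (p * q)                # since mu2 = p q n
    return p, q, n

# Moments of (.9 + .1)^100: mu2 = 9 and mu3 = 9 * (.9 - .1) = 7.2
p, q, n = binomial_from_moments(9.0, 7.2)
print(p, q, n)
```

With observed data the moments carry sampling errors, so the recovered n will not in general be a whole number.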
22 It is of interest to note that several investigators have derived this law
independently. Thus H. Bateman derives this expression in an appendix to
the article of Prof. Rutherford and H. Geiger previously referred to. This
is, in a way, an illustration of the apparent need of a broader dissemination of
information relating to the application of statistical methods of analysis to
engineering and physical data. It is also of interest to note that this law has
been used to advantage in the discussion of telephone trunking problems.
23 If the classification is based upon the presence or absence of a single
characteristic, this characteristic is often referred to as an attribute.
p, q and n from the moments of the distribution. 24 Even when p,
q and n are known, the arithmetic involved in calculating the terms of
the binomial is often prohibitive, and, therefore, it is necessary to
obtain certain approximations corresponding to the three laws of
error; that is, normal, second approximation, and the law of small
numbers. Tables for the normal law and for the law of small numbers
are readily available in many places, while those for the second ap-
proximation are given by Bowley. 25
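The saving in arithmetic offered by the law of small numbers can be illustrated as follows. This is a sketch under assumed values: the choice n = 1000 and p = .002 (so that pn = 2) is hypothetical, and the function names are ours.

```python
import math

def poisson_terms(lam, k_max):
    """Successive terms of e^(-lam) * (1 + lam + lam^2/2! + ...)."""
    return [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(k_max + 1)]

def binomial_terms(n, p, k_max):
    """First k_max + 1 terms of the point binomial (q + p)^n."""
    return [math.comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(k_max + 1)]

n, p = 1000, 0.002          # hypothetical: many causes, each very improbable
lam = n * p
poisson = poisson_terms(lam, 10)
binomial = binomial_terms(n, p, 10)

for k, (a, b) in enumerate(zip(poisson, binomial)):
    print(f"{k:2d}  Poisson {a:.6f}   binomial {b:.6f}")
```

The Poisson terms require only λ, whereas the binomial terms require both p and n and involve very large factorials.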
Even under conditions where the binomial expansion does not hold,
Edgeworth has shown that it is possible to obtain the following general
approximation :
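$$ y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \left[ 1 - \frac{k}{2}\left(\frac{x}{\sigma} - \frac{x^3}{3\sigma^3}\right) + \frac{k^2}{8}\left(-\frac{5}{3} + \frac{5x^2}{\sigma^2} - \frac{5x^4}{3\sigma^4} + \frac{x^6}{9\sigma^6}\right) + \frac{\beta_2 - 3}{8}\left(1 - \frac{2x^2}{\sigma^2} + \frac{x^4}{3\sigma^4}\right) \right]. \tag{7} $$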
This holds providing the observations are influenced by a large number
of causes, each of which varies according to some law of error but
not necessarily to the normal law.
Gram-Charlier Series. Gram, according to Fisher, 26 was the first
to show that the normal law is a special case of a more generalized
system of skew frequency curves. He showed that the arbitrary
frequency function F(X) can be represented by a series of terms in
which the normal law is the generating function φ(X). Thus
$$ F(X) = c_0\,\phi(X) + c_1\,\phi'(X) + c_2\,\phi''(X) + \dots \tag{8} $$
where c_0, c_1, c_2, etc., are constants which may be determined from the
moments of the observed data. This series is similar to that already
mentioned in the above equation (7) which Edgeworth has obtained
in several different ways. This law is of interest from the viewpoint
of either a physicist or an engineer in so far as it gives him a picture
of the causal conditions consistent with an accepted theoretical
curve. Thus, if either the causes of variation are within a certain
degree not entirely independent, or the errors are not linearly ag-
gregated, the observed frequency distributions may be expected to
conform to an equation such as 8. This equation has been found to
fit a much larger group of observed distributions than the normal law
24 See footnote 26.
25 See for example Pearson, K., Tables for Biometricians and Statisticians,
Cambridge University Press.
26 Fisher, Arne, Theory of Probabilities, page 182.
and the publication of the necessary tables by Fisher 27 and Glover 28
makes the study of such a curve more feasible. The author finds, for
example, that this series furnishes a much closer fit to the distribu-
tion of shots, Fig. 5, referred to above than any other that he has
tried.
Theoretically we should be able to improve the approximation by
taking a large number of terms of the series. Such a procedure,
however, involves the use of moments higher than the first four,
and the errors in these moments are so large as to make their use
impractical.
In spite of the uncertainty attached to the interpretation of the
physical significance of fitting any of these curves to data, one very
practical observation has been made : that is, if an observed series of
frequencies could not be fitted by a theoretical curve in any of the
ways already mentioned, careful consideration of the possible reasons
for the observed poor fit has in practically every instance suggested
the cause or causes thereof. We shall refer to only one practical
example.
The data have already been given above in Table II. It has been
noted that in this instance the variations in the averages of groups
of several thousand observations showed that the differences were
significant. If the observed distributions had been normal, it would
have been necessary to assume either that the methods of making
the measurements were different for the different groups of observers,
and for the different machines, or that the manufacturing methods
were experiencing a trend. Although the observed frequency curves
for the different groups were found to be smooth, the observed fre-
quencies could not be readily fitted by any curve previously de-
scribed. This naturally led to a search for the existence of any one
of a number of causes affecting the observations which might produce
such a divergence between theory and practice. One by one these
causes were found and eliminated and as they were the degree of fit
between the results of theory and practice increased. For example,
it was found that some of the groups of observations were for trans-
mitters assembled from only two or three lots of carbon. Trans-
mitters assembled from one lot of carbon had a different average
efficiency from those assembled from another lot. Naturally the
27 Fisher, Arne, Loc. cit. As noted by Mr. Fisher, page 214, the values of
φ(x) and its first 6 derivatives to 7 decimal places for values of x up to 4
and progressing by intervals of 0.01 were given by Jorgensen in his "Frekvens-
flader og Korrelation."
28 Glover, J. W., Tables of Compound Interest, Functions, etc., 1923 Edition
published by George Wahr, Ann Arbor, Michigan.
resultant distribution was a compound of a few separate but similar
distributions about different averages. When the distributions of
the efficiencies of the different lots of carbon were determined separ-
ately they were found to be consistent with the second approximation.
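The effect of compounding lots with different averages can be illustrated numerically. The sketch below is hypothetical throughout: it supposes two lots of equal size, each normal with unit standard deviation, whose averages differ by three units, and it computes the constant β2 = μ4/μ2² of the compound. For a single normal lot β2 is 3.

```python
import math

def normal_density(x, mean, sd):
    return math.exp(-((x - mean) ** 2) / (2.0 * sd ** 2)) / (sd * math.sqrt(2.0 * math.pi))

def beta2_of_two_lots(mean_a, mean_b, sd, share_a=0.5, step=0.001):
    """beta_2 of a compound of two normal lots, by simple numerical summation."""
    lo = min(mean_a, mean_b) - 10.0 * sd
    hi = max(mean_a, mean_b) + 10.0 * sd
    centre = share_a * mean_a + (1.0 - share_a) * mean_b   # mean of the compound
    mu2 = mu4 = 0.0
    for i in range(int(round((hi - lo) / step)) + 1):
        x = lo + i * step
        y = (share_a * normal_density(x, mean_a, sd)
             + (1.0 - share_a) * normal_density(x, mean_b, sd))
        mu2 += y * (x - centre) ** 2 * step
        mu4 += y * (x - centre) ** 4 * step
    return mu4 / mu2 ** 2

print(f"one lot  : beta2 = {beta2_of_two_lots(0.0, 0.0, 1.0):.4f}")
print(f"two lots : beta2 = {beta2_of_two_lots(0.0, 3.0, 1.0):.4f}")
```

A compound of this kind is flatter than any single normal curve, so no single curve of the normal type fits it well even though each component is well behaved.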
Thus, although it may be impossible to conclude that the a priori
assumptions underlying a given law of distribution are fulfilled because
the observations are found to be consistent therewith, nevertheless,
the fact that the observed and the theoretical distributions do not
agree suggests the necessity of seeking for certain typical causes
which may be expected to introduce such discrepancies. This point
is of special importance in connection with the study of ways of
sampling product in order to determine whether or not the manu-
facturing process is subject to trends. Thus, if a product is sampled
at two periods, and the distributions of both groups of observations
are found to be random about different averages, it is highly probable
that the difference indicates a trend in the manufacturing methods,
providing the difference between the averages is greater than 3 times
the standard deviation of the average. When, however, the two
distributions are found to be inconsistent with a random system of
causes, it is quite probable that the condition of sampling has not been
carefully controlled.
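The comparison of two sampling periods may be sketched as follows. The reading adopted here is an assumption: the difference between the two averages is compared with three times the standard deviation of that difference, computed from the standard deviations of the two averages. The data are hypothetical.

```python
import math
import statistics

def trend_indicated(first, second, factor=3.0):
    """True if the averages differ by more than `factor` standard deviations."""
    difference = abs(statistics.fmean(first) - statistics.fmean(second))
    # Variance of each average is the sample variance divided by the sample size.
    variance_of_difference = (statistics.variance(first) / len(first)
                              + statistics.variance(second) / len(second))
    return difference > factor * math.sqrt(variance_of_difference)

# Hypothetical efficiencies from two periods (illustrative values only).
early = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7]
late = [x + 1.0 for x in early]

print(trend_indicated(early, late))
print(trend_indicated(early, early))
```

Such a test presupposes that each group is itself random about its own average, which is the condition the text requires before a trend is inferred.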
Hypergeometric Series. Pearson has shown several ways in which
a frequency distribution may be represented by a hypergeometric
series. Thus the chances of getting r, r— 1, . . . bad transmitters
from a lot containing pn bad and qn good and where r instruments
are drawn at a time may be represented by the terms of such a series.
More important, however, is Pearson's solution 29 of what he calls
the fundamental problem of statistics. He shows, following the line
of reasoning similar to that originally suggested by Bayes, that if in
a sample of k_1 = (m + n) trials, an event has been observed to occur m
times and to fail n times, in a second group of k_2 trials the chances
of the event occurring r times and failing s times are given by the
successive terms of a hypergeometric series. We cannot consider
here the questions underlying the justification of this method of
solution, for, as is well-known, the application of Bayes' theorem is
questioned by many statisticians. We can profit, however, by the
broad experience of Prof. Pearson, for he has apparently accumulated
an abundance of data which are consistent with the theory.
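The first use of the hypergeometric series mentioned above, that of drawing instruments from a lot of known composition, may be sketched as follows. The lot of 100 bad and 900 good transmitters and the sample of 20 are hypothetical figures, and the function name is ours.

```python
import math

def chance_of_bad(bad_in_lot, good_in_lot, drawn, bad_drawn):
    """Chance of exactly `bad_drawn` bad instruments when `drawn` are taken at a time."""
    lot = bad_in_lot + good_in_lot
    return (math.comb(bad_in_lot, bad_drawn)
            * math.comb(good_in_lot, drawn - bad_drawn)
            / math.comb(lot, drawn))

# Hypothetical lot: 100 bad and 900 good transmitters, 20 drawn at a time.
chances = [chance_of_bad(100, 900, 20, x) for x in range(21)]
expected_bad = sum(x * c for x, c in enumerate(chances))

for x in range(6):
    print(f"{x} bad: {chances[x]:.4f}")
print(f"expected number bad: {expected_bad:.3f}")
```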
The answer to this problem is of special importance in connection
with the inspection of product which in many instances runs into
millions yearly. We must keep the cost of inspection at a minimum,
29 Pearson, K., Biometrika, October, 1920, pp. 1-16.
which means that the sample numbers must be small, and yet we see
from the solution derived from Pearson the significance of the sizes of
both the original and the second sample. Thus, he 30 shows that the
standard deviation σ is given by the equation
$$ \sigma^2 = k_2\,p\,q\left(1 + \frac{k_2}{k_1}\right). \tag{9} $$
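The dependence on both sample sizes can be checked numerically. The sketch below rests on two assumptions that are ours: that the chances for the second sample follow from Bayes' theorem with all values of the unknown probability equally likely a priori, and that the approximate variance has the form k_2 p q (1 + k_2/k_1) with p = m/k_1 and q = n/k_1. The figures are hypothetical.

```python
import math

def log_beta(a, b):
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def second_sample_chances(m, n, k2):
    """Chances of r = 0..k2 occurrences in k2 trials after m occurrences, n failures."""
    chances = []
    for r in range(k2 + 1):
        s = k2 - r
        log_term = (math.log(math.comb(k2, r))
                    + log_beta(m + r + 1, n + s + 1)
                    - log_beta(m + 1, n + 1))
        chances.append(math.exp(log_term))
    return chances

m, n, k2 = 100, 900, 100          # hypothetical first sample and second sample size
k1 = m + n
chances = second_sample_chances(m, n, k2)
mean = sum(r * c for r, c in enumerate(chances))
variance = sum((r - mean) ** 2 * c for r, c in enumerate(chances))
approximate = k2 * (m / k1) * (n / k1) * (1.0 + k2 / k1)

print(f"variance from the series : {variance:.3f}")
print(f"approximate formula      : {approximate:.3f}")
```

The factor (1 + k_2/k_1) shows why a small original sample inflates the uncertainty of what may be expected in the second.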
Multimodal Distributions. These occur frequently in engineering
work and particularly in connection with the inspection of large
quantities of apparatus. One such instance has already been referred
to in the discussion of the data given in Table II, and another is
illustrated by the data given in Fig. 1. Prof. Pearson 31 has developed
a method for determining analytically whether or not the observed
distribution is such as may be expected to have arisen from the com-
bination of two normal components, the mean values of which are
different. The method involves the solution of a ninth degree equa-
tion. As a result, the arithmetic work is in many cases prohibitive.
This method cannot be applied to the data given in Fig. 1 primarily
because the number of observations is not sufficiently great.
Pearson's Closed Type Curves. 32 One of the best known statistical
methods for graduating data is that developed by Prof. Pearson. His
system of closed type curves arises from the solution of the differ-
ential equation derived upon the assumption that the distribution
is uni-modal and touches the axis when y = 0. In the hands of Pearson
and his school great success has been attained in graduating data
collected from widely different fields, although primarily from those
of biology, psychology, and economics. The choice of curve to
represent a given distribution rests primarily upon a consideration
of a criterion involving two constants, β1 = k² and β2, both of which
have been defined previously in footnote 21.
In the early study of the distributions of efficiencies of product
transmitters an attempt was made to apply this system of curves.
For example, the Pearson types are indicated in Table II. In no
instance, however, was it possible to obtain a very satisfactory fit
between the observed and the theoretical distributions. Further-
more, the arithmetical work required to calculate a theoretical dis-
tribution in this way is excessive. We must also consider what
physical significance can be attached to the different types of curves.
The answer is not definite. Under certain conditions the generalized
30 Pearson, K., Philosophical Magazine, 1907, pp. 365-378.
31 Pearson, K., Philosophical Magazine, Vol. 1, 1901, pp. 115-119.
32 Elderton, Frequency Curves and Correlation.
equation of Pearson breaks down to the normal law and the second
approximation. These, of course, can be explained as previously.
The fundamental equation, however, serves to cover the condition
where the causes are correlated. Thus, because of the lack of a
clear conception of the physical significance of the observed varia-
tions in the type of curves indicated in Table II, it was not possible
easily to set up experiments to find the causes of these variations.
For this reason preference has been given to the use of frequency
distributions derived upon a less empirical basis following the original
lines laid down by La Place, Edgeworth, Kapteyn, and others previ-
ously referred to. Another very practical reason for choosing the
latter type of curve is that it involves for the most part the use of
only the first three moments of the distribution instead of the first
four required for differentiating between the Pearson types. In
those cases where the interest is less of physical interpretation than
of graduating an observed set of data, preference may go to the more
generalized system of Pearson.
How Can We Choose the Best Theoretical Frequency
Distribution?
We have already briefly reviewed some of the different methods
for obtaining a theoretical frequency distribution from a consider-
ation of the moments of the observed frequencies. We have seen in
Table III that by using different methods we obtain different degrees
of approximation to the hypothetically observed distribution which
in this case corresponds to the terms of the binomial expansion
1000(.1+.9)^100. Similarly from Fig. 5 it is seen that the Gram-Charlier
series is a much closer approximation to the observed distribution
than that derived upon the assumption of the normal law.
In any given case we are naturally confronted with the question :
What is the best theoretical distribution? We shall consider four
methods for obtaining an answer.
The oldest, simplest, and in many instances the most practical,
is that of comparing graphically or in tabular form the theoretical
distribution with the one observed. This method is, however,
inaccurate and qualitative. It does not furnish us with a quantitative
method of measuring the closeness of fit between theory and practice,
and in certain instances it is absolutely misleading. It is of interest
to see how all of these things can be truly said of one and the same
method. The first two characteristics, that is, oldest and simplest,
are perhaps readily granted. It remains to be pointed out more
definitely wherein the method is sadly deficient as a quantitative
measure, and therefore often misleading; whereas in certain instances
it may be, nevertheless, the only practical method that can be used.
Graphical Method. The graphical method itself may be subdivided
into two parts. Let us consider first the plot of the observed and
theoretical frequencies. As an example of the unsatisfactory nature
of this form of comparison, it is of interest to consider certain data
Fig. 8. Twelve dice thrown 4096 times, a throw of 4, 5, or 6 points reckoned a success: observed frequency and theoretical frequencies against number of successes
given by Yule 33 in which 12 dice are thrown 4,096 times, a throw of
4, 5, or 6 points being reckoned a success. If the dice are symmetrical
p = q = 1/2 and the theoretical distribution is given by 4,096(1/2 + 1/2)^12,
the terms of which as given by Yule are presented in the third column
of Fig. 8. It is suggested that the reader, before going further,
consider the graphical and tabular representation of these data.
The smooth curve is the theoretical distribution 4,096(1/2 + 1/2)^12.
It has been the author's experience that in practically every
instance in which this curve has been shown to an individual for the
first time the impression is that which Yule evidently desires
to produce by the illustration: that is, there is a very good fit between
theory and practice. This distribution is, however, not symmetrical :
it is skew. The dice used in this experiment were not symmetrical:
that is, p=f^q. How do we know that these statements are true?
Let us consider the normal and second approximation as given
in the fourth, fifth, and sixth columns. 34 Obviously the degree of
fit is closest for the second approximation, although that between
the normal distribution and the observed frequencies is closer than
that between the terms of the binomial expansion and the observed
frequencies. To be sure, the normal law is only an approximation
to the point binomial when $p = q$ and $n = \infty$. The normal
distribution, however, is calculated about the observed average 6.139,
instead of about the theoretical average 6. If the dice are
non-symmetrical, the average will not be 6, and, therefore, the center
of the distribution will be shifted after the fashion observed. The
improvement in fit corresponding to the normal distribution is
therefore primarily attributable to that introduced by shifting the
center of the distribution, indicating that $p \neq q$. However, if
$p \neq q$, the second approximation should improve the fit, and for
either value of $k$ this is found to be the case. Thus even though we
cannot measure quantitatively the improvement of fit, the qualitative
evidence presented in this figure is sufficient to warrant the
conclusion that the dice were non-symmetrical, and therefore, that the
smooth curve is an unsatisfactory graduation of the data. In fact, by
using a quantitative method for measuring the goodness of fit to be
discussed in a succeeding paragraph, it follows that only 15 times out
of 10,000 can we expect a divergence from theory as large or larger
than that exhibited by the frequencies corresponding to the point
binomial.

33 Yule, "Introduction to the Theory of Statistics."
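The shift of the center discussed above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the author's computation: the observed frequencies are an assumed transcription of the second column of Fig. 8 (their total and average agree with the 4,096 throws and the average 6.139 quoted in the text), and the names `observed`, `theoretical`, and `p_hat` are introduced here for illustration only.

```python
from math import comb

# Observed frequencies of 0..12 successes, as read from Fig. 8
# (an assumed transcription; the total must come to 4,096).
observed = [0, 7, 60, 198, 430, 731, 948, 847, 536, 257, 71, 11, 0]

n_dice, n_throws = 12, 4096

# Terms of 4,096 (1/2 + 1/2)^12, the distribution for symmetrical dice.
theoretical = [n_throws * comb(n_dice, r) / 2 ** n_dice
               for r in range(n_dice + 1)]

# Observed average number of successes per throw.
mean_successes = sum(r * f for r, f in enumerate(observed)) / sum(observed)

# Chance of success per die implied by the observed average.
p_hat = mean_successes / n_dice

print([round(t) for t in theoretical])
print(round(mean_successes, 3), round(p_hat, 4))
```

Under these assumptions the implied chance of success per die comes out slightly above 1/2, which is the sense in which the center of the distribution is shifted from 6.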
We have also previously called attention to the fact that in Fig. 7
the eye does not serve to differentiate satisfactorily between the
distribution calculated upon the assumption of the normal law and that
given by the binomial expansion when the conditions underlying the
normal law are far from being satisfied.
Regardless of these criticisms, such graphical methods cannot be
entirely dispensed with. Thus the graphical representation of the
data given in Fig. 1 shows very clearly that the distribution is prob-
ably bimodal, although with no more observations than are available
it is practically impossible to show that this is true in any other way.
Instead of plotting the frequency $y$ of occurrence of a variable of
magnitude $x$ as ordinate, and $x$ as abscissa, the practice is often
followed of plotting as ordinate the percentage of the total number $N$
of observations having magnitudes of $x$ or less. 35
Any curve $\phi(y, x) = 0$ may be replaced by a straight line. 36 In
this way we can transform the integral curve into a straight line by
choosing an $x$-scale proportional to the integral from 0 to $x$ of the
probability curve. 37 When plotted in this way, a normal distribution
appears as a straight line on such paper. At first it may appear very
simple to determine whether or not the data conform to a straight
line, but in practice this is not always so easy. Thus, we have seen
that the distribution of shots presented in Fig. 5 is not normal, but
when these results are plotted on probability paper we have the
curve given in Fig. 9. The reader should be cautioned that in such
a case there is a temptation to consider that the observed points are
approximately well fitted by the straight line, although this is not
the case.

[Fig. 9. The distribution of shots of Fig. 5 plotted on probability
paper.]

34 Two values of $k$ were calculated as indicated in the lower
right-hand corner of the figure.
35 Heindlhoffer, K. and Sjovall, H., Endurance Test Data and their
Interpretation, Advance paper presented at the Meeting of the American
Society of Mechanical Engineers, Montreal, Canada, May 28 to 31, 1923.
36 Runge, C., Graphical Methods, p. 53.
Probability paper could be ruled for different theoretical distribu-
tions, but in its present form it serves only to determine whether or
not the distribution is approximately normal. Its use leaves much
to be desired in the way of a quantitative measure of the degree of
fit between the theoretical and observed distributions.
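The ruling of probability paper described above can be imitated numerically. The sketch below is an illustration under stated assumptions rather than a description of any particular commercial paper: `probability_scale` is a name introduced here, and the universe with average 10 and standard deviation 2 is hypothetical. It maps each cumulative percentage to the corresponding normal deviate, so that the cumulative curve of a normal universe has a constant slope.

```python
from statistics import NormalDist


def probability_scale(cum_percent):
    """Position of a cumulative percentage (0 < value < 100) on a
    normal probability scale, in units of the standard deviation."""
    return NormalDist().inv_cdf(cum_percent / 100.0)


# Hypothetical normal universe: average 10, standard deviation 2.
universe = NormalDist(mu=10.0, sigma=2.0)
xs = [6.0, 8.0, 10.0, 12.0, 14.0]

# Cumulative percentage at each x, then its position on the probability scale.
points = [(x, probability_scale(100.0 * universe.cdf(x))) for x in xs]

# Slopes between successive points; for a normal universe they are all 1/sigma.
slopes = [(points[i + 1][1] - points[i][1]) / (points[i + 1][0] - points[i][0])
          for i in range(len(points) - 1)]
print(slopes)
```

Observed cumulative percentages can be passed through the same function; how far the resulting points may stray from a line before normality is doubted is exactly what the plot does not tell us.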
Calculation of $\sigma$, $\sqrt{\beta_1} = k$, and $\beta_2$. Let us
consider what information can be obtained as to the best theoretical
distribution from only a consideration of the first four moments of
the observed frequencies. Let us consider the values of $k$ and
$\beta_2$ presented in Table V. These have been calculated for the
point binomial $(p + q)^n$ where $p$, $q$ and $n$ have been given
different values. For the normal law corresponding to $p = q$ and
$n = \infty$, we have $k = 0$ and $\beta_2 = 3$. Thus, if in a practical
case we find an observed distribution for which $k = 0$ and
$\beta_2 = 3$, it is highly probable that the distribution is
approximately normal. It is true, however, that in sampling from a
universe in which $p = q$ and $n = \infty$, the observed values of $k$
and $\beta_2$ will seldom be exactly equal to 0 and 3 respectively.
Then we must ask what range of values may be expected in these two
factors for distributions which are practically normal. For such cases
the variations in $k$ and $\beta_2$ are practically normal 38 and have
standard deviations

$$\sigma_k = \sqrt{\frac{6}{N}} \quad \text{and} \quad \sigma_{\beta_2} = \sqrt{\frac{24}{N}},$$

where $N$ is the number of observations. Thus, theoretically any
series of observations for which the calculated values of $k$ and
$\beta_2$ fall within the ranges $0 \pm 3\sigma_k$ and
$3 \pm 3\sigma_{\beta_2}$ may have arisen from a normal universe.
Since, however, the errors $\sigma_k$ and $\sigma_{\beta_2}$ of
sampling are so large, this method does not furnish a very practical
test for distributions consisting of only a few observations. This is
particularly true since, even for very skew distributions, the values
of $k$ and $\beta_2$ do not differ much from 0 and 3 respectively (see
Table V). If, however, the number of observations is large, the values
of $k$ and $\beta_2$ in themselves often indicate very definitely that
the observed frequencies are not consistent with the normal law. For
example, the calculated values of $k$ and $\beta_2$ given for the
inspection data in Table II show conclusively that in practically
every instance the observed data could not have arisen from a normal
universe. So long as we do not use Pearson's system of curves, all
that these two factors indicate is that the observed data do or do not
conform to the normal law, and in this respect their use is limited as
is that of the probability paper mentioned above.

[Table V. Values of $k$ and $\beta_2$ for the point binomial
$(p + q)^n$ for different values of $p$, $q$, and $n$.]

37 Whipple, G. C., The Elements of Chance in Sanitation, Franklin
Institute Journal, Vol. 182, July, December, 1916, pp. 37-59 and
205-227.
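The test described in this section reduces to a short calculation. The following is a sketch only: the function names are introduced here, $k$ is taken as the third moment divided by $\sigma^3$ (an assumed reading of $\sqrt{\beta_1}$ with the sign of the third moment), and no grouping corrections are applied to the moments. The illustration uses the terms of 4,096 $(1/2 + 1/2)^{12}$ as if they were observed frequencies.

```python
from math import comb, sqrt


def shape_factors(values, freqs):
    """Return N, the average, sigma, k and beta_2 of a frequency table."""
    n = sum(freqs)
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    # Second, third and fourth moments about the average.
    mu2, mu3, mu4 = (sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / n
                     for r in (2, 3, 4))
    sigma = sqrt(mu2)
    return n, mean, sigma, mu3 / sigma ** 3, mu4 / mu2 ** 2


def consistent_with_normal(k, beta2, n):
    """True when k and beta_2 lie within 0 +/- 3 sigma_k and 3 +/- 3 sigma_beta2."""
    sigma_k = sqrt(6 / n)
    sigma_beta2 = sqrt(24 / n)
    return abs(k) <= 3 * sigma_k and abs(beta2 - 3) <= 3 * sigma_beta2


# Illustration: the symmetrical point binomial for 12 dice and 4,096 throws.
values = list(range(13))
freqs = [comb(12, r) for r in values]
n, mean, sigma, k, beta2 = shape_factors(values, freqs)
print(n, mean, round(sigma, 4), round(k, 4), round(beta2, 4))
print(consistent_with_normal(k, beta2, n))
```

Even with 4,096 observations this symmetrical binomial falls inside both ranges, which illustrates why the two factors alone are not a very sensitive test.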
In order to show that the factor $\beta_2$ is not in itself a very
sensitive measure of the variability from the normal law, I have
considered the following special case. Let us assume that the observed
distributions can be grouped into two parts depending upon whether
or not the observations cluster about the average $X_1$ or $X_2$
measured from a point which is the arithmetic mean of the entire
distribution taken about a common origin. This corresponds to the
practical case such as that indicated by Fig. 1 which, as already
pointed out, often occurs in practice.

38 For a critical study of the conditions under which the probable
errors of these constants have a real significance, reference should
be made to a discussion of this problem by Isserlis in the Proceedings
of the Royal Society, Series A, Vol. 92, pp. 23 seq., 1915. Obviously
even for the normal distribution all of the moments will be skew. This
follows from a consideration of equation 4.
The value of $\beta_2$ for the entire distribution is then given by the
following expression:

$$\beta_2 = \frac{(\Sigma y_1 + \Sigma y_2)\left[\Sigma y_1\left({}_1\mu_4 + 4X_1\,{}_1\mu_3 + 6X_1^2\,{}_1\mu_2 + X_1^4\right) + \Sigma y_2\left({}_2\mu_4 + 4X_2\,{}_2\mu_3 + 6X_2^2\,{}_2\mu_2 + X_2^4\right)\right]}{\left[\Sigma y_1\left({}_1\mu_2 + X_1^2\right) + \Sigma y_2\left({}_2\mu_2 + X_2^2\right)\right]^2}$$

where ${}_1\mu_i$ and ${}_2\mu_i$ refer to the adjusted $i$th moments of
the observations about their respective mean values. Let us assume
that $|X| = |X_1| = |X_2|$; $k_1 = k_2 = 0$;
${}_1\beta_2 = {}_2\beta_2 = 3$; ${}_1\mu_2 = {}_2\mu_2$;
$\Sigma y_1 = \Sigma y_2$; and $\sigma_1 = \sigma_2$, where
$\Sigma y_1$ and $\Sigma y_2$ represent the total numbers of
observations in the first and second groups respectively. It may be
shown by substitution in this equation that, if $|X| = |\sigma_1|$,
$\beta_2 = 2.5$, whereas, if $|X| = |10\sigma_1|$, $\beta_2 = 1$,
approximately. Thus, if the numbers of observations in each of the
two sub-groups are the same and the component curves are normal,
the value of $\beta_2$ for the entire distribution about the mean of
the two will, in general, decrease as $|X|$ becomes large in comparison
with $|\sigma_1|$. Differences in $\beta_2$ of this magnitude are
difficult to establish. Furthermore the skewness is zero, and therefore
does not indicate the bi-modal character of the distribution.
Let us consider the case where $|aX_1| = |X_2|$; $k_1 = k_2 = 0$;
${}_1\beta_2 = {}_2\beta_2 = 3$; $\Sigma y_1 = a\,\Sigma y_2$;
$\sigma_1 = \sigma_2$. If $a = 10$ and $|X_1| = |\sigma_1|$, then
$\beta_2 = 8+$, whereas if $|X_1| = |10\sigma_1|$, then
$\beta_2 = 100$, approximately. 39 Thus, for comparatively wide
differences in the averages, it requires a large number of
observations in order to increase the precision of $\beta_2$ to such an
extent as to prove the significance of deviations in this factor of the
magnitudes noted above.
The skewness in this case is not zero and its significance could be
established with a comparatively small number of measurements. In
any of the above cases a carefully constructed plot would serve to
indicate the bimodal characteristic of the curve better than the study
of the factor $\beta_2$.
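The behavior of $\beta_2$ for a distribution made up of two normal groups can be sketched directly from the moments of the components. The function below is an illustration under stated assumptions, not the author's working: its name and arguments are introduced here, each group is taken to be normal (third moment zero, fourth moment $3\sigma^4$ about its own average), and the group averages are measured from the mean of the entire distribution.

```python
def beta2_two_groups(n1, x1, sigma1, n2, x2, sigma2):
    """beta_2 about the combined mean for two normal groups.

    n1, n2: numbers of observations in the groups.
    x1, x2: group averages measured from the combined mean,
            so that n1 * x1 + n2 * x2 = 0.
    """
    n = n1 + n2
    # Second moment of the whole about the combined mean.
    mu2 = (n1 * (sigma1 ** 2 + x1 ** 2) + n2 * (sigma2 ** 2 + x2 ** 2)) / n
    # Fourth moment of the whole; the third-moment terms vanish for normal groups.
    mu4 = (n1 * (3 * sigma1 ** 4 + 6 * x1 ** 2 * sigma1 ** 2 + x1 ** 4)
           + n2 * (3 * sigma2 ** 4 + 6 * x2 ** 2 * sigma2 ** 2 + x2 ** 4)) / n
    return mu4 / mu2 ** 2


# Equal groups, averages one standard deviation either side of the mean.
print(beta2_two_groups(1, 1.0, 1.0, 1, -1.0, 1.0))
# Equal groups, averages ten standard deviations either side of the mean.
print(round(beta2_two_groups(1, 10.0, 1.0, 1, -10.0, 1.0), 3))
```

Only the ratio of the group sizes and the ratio of the separations to $\sigma_1$ enter the result, so unit group sizes and unit standard deviations suffice for the illustration.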
Pearson's Criterion of Goodness of Fit. A much more powerful
criterion has been developed by Prof. Pearson 40 in a series of articles
in the Philosophical Magazine. It is true that this test for goodness
of fit cannot be used indiscriminately. In fact the application of this
criterion is subject to numerous limitations clearly set forth in the
original papers by Pearson and in more recent articles on the
mathematics of statistics. In the use of the method it is necessary
that these be kept in mind by the individual making the original
analysis of the data. Irrespective of these facts, however, the method
itself is one of the most useful tools available for measuring in a
quantitative way the "goodness of fit" between two distributions. The
significance of the values of $P$ given in Figs. 5, 6, and 8 now
becomes evident.

39 Here again it should be noted that the values of $\beta_2$ are
independent of the actual frequencies of each of the two groups and
depend only upon the ratio of these frequencies and upon the ratio
of $|X_1|$ to $\sigma_1$.
Engineering Judgment. The fourth very practical and one of the
most useful methods of comparing the theoretical with the observed
distribution is that of applying common sense or engineering judgment.
To quote from a recent article of Prof. Wilson 41 we have: "And as
the use of the statistical method spreads we must and shall appreciate
the fact that it, like other methods, is not a substitute for, but a
humble aid to the formation of a scientific judgment." Even with the
use of all the statistical methods known to the art, it remains im-
possible to determine the true nature of the complex of causes which
control a set of observations. We can present plausible explanations,
but we can never be sure that they are right. Sometimes we can
present two plausible explanations and then we must fall back on
engineering judgment or common sense to decide between them. A
striking illustration of this fact is presented in the following paragraph.
40 If we divide the entire range of variation into $s$ equal intervals
for which the observed frequencies are $f_1, f_2, \ldots, f_s$ and the
corresponding theoretical frequencies are $f'_1, f'_2, \ldots, f'_s$,
Pearson calculates the function

$$\chi^2 = \sum \frac{(f' - f)^2}{f'}$$

from which he is able to determine the probability that a series of
deviations as large as, or larger than, that found to exist could have
arisen as a result of random sampling. Tables have been prepared which
give the probability of fit in terms of the number of intervals into
which the entire range has been divided and of the value of $\chi^2$.
41 Wilson, E. B., The Statistical Significance of Experimental Data,
Science, New Series, Vol. 58, 1493, October 10, 1923, pp. 93-100.
42 Philosophical Magazine, Vol. 1, 1901, pp. 110-124.

Prof. Pearson 42 has recently presented measurements of the cephalic
index of a certain group of skulls. The object of the investigation
was to determine if variation had gone on to such an extent as to
indicate the survival of the fitter inside a homogeneous population,
or the survival of two races both of which were in existence many ages
in the past. Pearson shows that, by a solution of a nonic equation,
he is able to find two component distributions which when added
together approximate very closely to the observed frequencies. The
observed data are given in the second column of Table VI and the
frequencies of Prof. Pearson's compound curve are given in the third
column of the table. The probability of fit between these two
distributions is seen to be approximately .96, which is indeed very
high, meaning, of course, that 96 times out of 100 we may expect
to find a system of deviations as large or larger than that actually
found. The author finds, however, that the theoretical distribution
(column 4) based upon the assumption of the second approximation is
also a very close fit to the observed frequencies, the probability of fit
being in this case .69. As a result of these calculations shall we
conclude that the distribution is composed of two normal components as
indicated in Fig. 10, or shall we conclude that the distribution is
homogeneous? In other words, do the skulls belong to two or to only
one race? The measure given by the probability of fit is, of course,
in favor of the first alternative. It is highly probable, however, that
if we had been given the observed distribution without any discussion
of what it meant we would have decided that it probably was
consistent with the assumption of the random system of causes such as
might underlie the second approximation.

TABLE VI
Rowgrave Skulls *

Cephalic   Observed       Compound       2nd Approxi-
Index      Distribution   Distribution   mation         $(f_1-f)^2/f_1$   $(f_2-f)^2/f_2$
           $f$            $f_1$          $f_2$

67           1              1              1
68           1              2              2               .50              .50
69           3              4              4               .25              .25
70           8              7              8               .14
71          13             11             14               .36              .07
72          13             18             22              1.39             3.68
73          33             28             30               .89              .30
74          36             39             39               .23              .23
75          49             50             48               .02              .02
76          59             59             55                                .29
77          69             65             59               .25             1.69
78          70             66             60               .24             1.67
79          54             60             58               .60              .28
80          58             52             53               .69              .47
81          40             43             46               .21              .78
82          31             35             39               .46             1.64
83          25             28             32               .32             1.53
84          28             23             26              1.09              .15
85          21             20             21               .05
86          20             17             16               .53             1.00
87           9             14             13              1.79             1.23
88          10             11             10               .09
89           6              8              7               .50              .14
90          10              6              5              2.67             5.00
91           2              4              3              1.00              .33
92           3              2              2               .50              .50
93           2              1              1              1.00             1.00
94           1              1              1
95                                         1                               1.00

$\Sigma$   675            675            676             15.77            23.75
Probability of fit                                        .957             .694

Ave. $= \bar{X} = 78.846$    $\sigma = 4.612$    $k = .521$    $\beta_2 = 3.181$
$\sigma_{\bar{X}} = .178$    $\sigma_\sigma = .126$    $\sigma_k = .0943$    $\sigma_{\beta_2} = .189$

* Phil. Mag., Vol. I, 1901, pp. 115-119.

[Fig. 10. Rowgrave skulls: observed and theoretical frequency curves,
showing the normal components, together with the corresponding
cumulative (integral) curves, plotted against cephalic index.]
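The last two columns of Table VI and the two probabilities of fit can be retraced as follows. This is a sketch under assumptions that should be stated plainly: the frequencies are transcribed from Table VI, the function names are introduced here, and the tables of probability of fit are assumed to have been entered with the number of intervals having a theoretical frequency, which in present-day terms means taking the degrees of freedom as that number less one.

```python
from math import erfc, exp, gamma, sqrt

# Frequencies for cephalic index 67..95, transcribed from Table VI.
observed = [1, 1, 3, 8, 13, 13, 33, 36, 49, 59, 69, 70, 54, 58, 40,
            31, 25, 28, 21, 20, 9, 10, 6, 10, 2, 3, 2, 1, 0]
compound = [1, 2, 4, 7, 11, 18, 28, 39, 50, 59, 65, 66, 60, 52, 43,
            35, 28, 23, 20, 17, 14, 11, 8, 6, 4, 2, 1, 1, 0]
second_approx = [1, 2, 4, 8, 14, 22, 30, 39, 48, 55, 59, 60, 58, 53, 46,
                 39, 32, 26, 21, 16, 13, 10, 7, 5, 3, 2, 1, 1, 1]


def chi_square(obs, theo):
    """Sum of (f' - f)^2 / f' over intervals with a theoretical frequency."""
    return sum((t - o) ** 2 / t for o, t in zip(obs, theo) if t > 0)


def chi_square_tail(x, df):
    """Probability of a chi-square as large as x or larger, for integer df."""
    z = x / 2.0
    # Starting values for df = 2 (even case) or df = 1 (odd case).
    a, q = (1.0, exp(-z)) if df % 2 == 0 else (0.5, erfc(sqrt(z)))
    # Recurrence Q(a + 1, z) = Q(a, z) + z^a e^(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * exp(-z) / gamma(a + 1.0)
        a += 1.0
    return q


for name, theo in (("compound", compound), ("2nd approximation", second_approx)):
    cells = sum(1 for t in theo if t > 0)
    x2 = chi_square(observed, theo)
    print(name, round(x2, 2), round(chi_square_tail(x2, cells - 1), 3))
```

Because Table VI sums terms already rounded to two places, the totals computed here may differ from 15.77 and 23.75 in the second decimal place. A present-day analysis would also reduce the degrees of freedom for each constant fitted from the data, which the sketch does not attempt.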
In other words, if we had been given merely the above set of skull
measurements, it is reasonable to suppose we might have concluded
that the distribution was homogeneous. However, when our judgment
is colored by the facts which cannot be presented in the array of
observed frequencies we must conclude that it is highly probable
that the observed data have arisen from a non-homogeneous pop-
ulation.
Statistical methods alone do not answer all of the questions that
are raised in this problem, nor do they answer them in many others.
There is almost always room for judgment to enter.
Thus, in analyzing a group of measurements of some characteristic
of a large number of transmitters, it often becomes necessary to
determine whether or not they can be subdivided into normal
components as in the above problem. In our case the subgroups
correspond to different kinds of carbon. Here, as in the data given by
Pearson, it often has been found necessary to base our final conclusion
partly upon facts not revealed by the data themselves.
The integral curves corresponding to the normal and observed
distributions are given in Fig. 10 in order to show that they do not
serve to indicate the difference between the observed and theoretical
distributions nearly as well as the actual frequency curves also given
in this figure. Fig. 11 presents the result on probability paper. In
this case the probability curves are as good as the frequency curves
for showing the divergence between theory and observation. It will
be recalled that this is not true for the similar curves given in Fig. 9.

[Fig. 11. Rowgrave skulls plotted on probability paper.]
Summary Statement of Suggested Method to be Followed in
the Analysis of Engineering and Physical Data
We have briefly reviewed the different methods for determining
the best theoretical distribution to represent observed data. The
following four steps indicate the ordinary procedure:
1. Obtain the first four corrected moments.
2. Calculate the average, standard deviation, $k$ and $\beta_2$, and
their standard deviations.
3. Calculate the theoretical distribution or distributions warranted
by the circumstances.
4. Apply one or more of the four methods of comparing the theoretical
and observed distributions to determine which one is
theoretically the best. 43
An illustration of the method of applying this form of analysis to
inspection data on transmitters is indicated in the schematic chart,
Fig. 12. The object of the inspection of apparatus in the process
of manufacture is obviously to determine the most probable law of
distribution, and from this to determine whether or not there is any
indication of a trend in the quality of the product. In the light of
what has been said, it is obvious that a complete report of this
character should contain the items called for in Fig. 12. The corrected
moments and the factors, such as the average, standard deviation,
$k$ and $\beta_2$, should be given. These factors provide us with
measures of the lack of symmetry, and can be used as pointed out in the
previous sections of this paper. Recording this amount of data makes it
possible for anyone interested either to check the calculations of the
theoretical frequencies and the conclusions derived therefrom, or to
calculate a different theoretical distribution based upon fundamentally
different hypotheses in a way such as has been illustrated already
in the discussion of the distribution of measurements of the cephalic
index, as given in Fig. 11.

[Fig. 12. Schematic inspection report. Entries: Product, Period,
No. Manufactured, Sample Tested, Characteristic Tested; observed
frequency, theoretical frequency, and the squared differences between
them; corrected moments; number of instruments between the limits
$L_1$ and $L_2$.]

43 If the observed distribution could not have arisen from a random
system of causes, it may be advisable to attempt to transform it into
an approximately random one, such as was done in connection with the
data in Fig. 6.
In most instances, however, it is highly probable that the man who
originally prepares the chart is charged with the responsibility of
choosing the best distribution, and, therefore, the chief interest of
those reading the report is centered upon the conclusions indicated
therein. The graphical representation of the observed distribution
by means of the histogram is helpful. The comparison of this with
the theoretical curve represented by a solid line shows qualitatively
whether or not the product is changing. The probability of fit gives
a quantitative measure of the degree of fit. The set of curves given
in Fig. 12 is drawn to illustrate a condition which may sometimes
happen when, for example, the standards used in the machines have
been changed. This is only typical of the results which may be
expected. Obviously, the form of such reports designed to meet
specific conditions will vary. That presented above is only typical
of one which has been found to be of value in presenting the analysis
of the results of inspection of certain types of apparatus.
Some Advantages Derived from a Comparatively Complete
Statistical Analysis
It has been pointed out that the value of either a physical or an
engineering interpretation of data depends upon the success attained
in deriving the best theoretical distribution. This is the equation
which fits the observed points best, and which, if possible, can be
interpreted physically. The previous discussion indicates the way
in which different causal relationships tend to produce typical fre-
quency distributions, and also the way in which statistical methods
may be used in finding a theoretical distribution which yields a
physical interpretation.
This point has been illustrated by several examples. It has been
shown that by a proper choice of theoretical curve a very close
approximation to an observed distribution can be obtained. This
has already been indicated in Table III. To emphasize this point,
however, let us consider once more the distribution of alpha particles
given in Table I. These data together with various theoretical 44
distributions are given in Table VII.

[Table VII. Distribution of alpha particles (data of Table I): number
of particles, observed frequency, and the theoretical frequencies given
by the normal law, the second approximation, the law of small numbers,
and the other distributions considered, with the values of $k$ and
$\beta_2$ and the probability of fit for each.]
Let us consider the data given in Table I by following the
procedure of analysis outlined in the previous section. The factors $k$
and $\beta_2$ when compared with their errors should indicate whether
or not the distribution is normal. As shown in Table VII, $k$ and
$\beta_2$ differ from 0 and 3 respectively, by more than 3 times their
respective standard deviations. As has already been pointed out, this
is sufficient evidence to indicate that the distribution is not normal.
If, however, we follow the next step and calculate theoretical
distributions based upon the assumption of the different laws, that is,
in this case, the normal law, the second approximation, and the
law of small numbers, we are naturally led to the choice of the best
distribution. This choice is materially influenced by the measure
of the probability of fit as recorded in the table. The law of small
numbers is obviously a very close approximation to the observed
frequencies.
One of the obvious things to do in this problem, but one that has
not been done previously, is to calculate the values of $p$, $q$ and
$n$, and from them the terms of the binomial expansion
$2608\,(p + q)^n$. The probability of fit between the terms of this
expansion and the observed frequencies is the highest given in the
table. This increases the evidence that the distribution is random. It
also does more. It serves to establish the facts that the probability
$p$ that an alpha particle will strike the screen is .046, and that the
maximum number of alpha particles which may ever be expected to strike
the screen is of the order of magnitude of 84. Granted then that we can
always find the most probable theoretical frequency distribution, let
us consider next the influence that the result may have in our
determination of the most probable value, the number of observations
between any two limits and the causal relationships governing the
distribution.
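The values of $p$ and $n$ quoted above can be approximated from the first two moments of the observed frequencies. What follows is a sketch under assumptions, not necessarily the calculation actually made: the frequencies are the counts of Rutherford and Geiger as commonly quoted, assumed here to be the data of Table I, and the fit equates the average to $np$ and the second moment about the average to $npq$, without any correction of the moments.

```python
# Frequencies of 0..14 alpha particles per interval, 2,608 intervals in all
# (assumed to be the data of Table I).
observed = [57, 203, 383, 525, 532, 408, 273, 139, 45, 27, 10, 4, 0, 1, 1]

total = sum(observed)
mean = sum(r * f for r, f in enumerate(observed)) / total
variance = sum(f * (r - mean) ** 2 for r, f in enumerate(observed)) / total

# Point binomial (p + q)^n: average = n p, variance = n p q.
q = variance / mean
p = 1.0 - q
n = mean / p

print(total, round(mean, 2), round(p, 3), round(n))
```

Under these assumptions $p$ comes out near .046 and $n$ in the middle eighties, in agreement with the order of magnitude stated in the text. Since $n$ is obtained by dividing by a small $p$, it is sensitive to small changes in the second moment and should be read only as an order of magnitude.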
Let us consider first the dependence of the most probable value upon
the type of distribution. In our present work in the study of carbon
the resultant distributions have been in most instances either random
or such that through a proper transformation they could be reduced
to such. For any distribution consistent with the second approximation
the most probable value is at a distance $-\frac{k\sigma}{2}$ from the
arithmetic mean. Many distributions have been found for which $k$ lies
between .5 and unity, and, therefore, this difference is from 1/4 to
1/2 of the standard deviation. Thus, the efficiencies of certain
standard types of transmitters are found to conform to such a law, and
the difference between the modal and average values is of the order of
magnitude of 0.4 mile.

44 The source of all distributions previously calculated is indicated.
The Poisson-Charlier series is similar to the Gram-Charlier series,
except that the law of small numbers is the generating function. It
serves as an admirable method of graduating certain classes of skew
distribution as illustrated by this example and by that given in
Table III.
Obviously the geometric mean of the sound intensities (Fig. 6)
and not their arithmetic mean is the most probable. The difference
between the two is quite large. The difference between the arithmetic
mean and the modal value for groups of data such as given in Fig. 1,
Tables II and VI is quite large. To use again the illustration of
the alpha particles, the observed most probable number is 4, whereas
the observed average 45 is 3.87. Judging from the best theoretical
distribution the most probable number of alpha particles is 3. Choosing
the number 3, it is seen that either of the other two numbers differs
from this by approximately 1/2 the standard deviation. Such results
are, however, not confined to the work of the present investigation
nor to the examples previously cited, as is evidenced by the data given
in the last column of Table VIII.
TABLE VIII

N = Number of   Source of    Percentage                 Percentage                  Percentage                  Average
Observations    Data *       Within                     Within                      Within                      - Modal
                             $\bar{X} \pm \sigma$       $\bar{X} \pm 2\sigma$       $\bar{X} \pm 3\sigma$

1000            54           66.6                       97.2                        99.6                          .803
251             66           78.1                       94.8                        97.6                         1.042
9154            10           67.7                       95.5                        99.6                          .031
2162            79           70.1                       95.1                        99.3                         -.311
368             84           73.4                       94.6                        97.0                          .422
675             Table VI     68.7                       94.1                        99.6                          .247
Normal Law                   68.26                      95.44                       99.73

* Elderton, "Frequency Curves and Correlation," published by C. & E.
Layton, London, 1906.
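The theoretical percentages for the normal law can be recomputed from the error function. This is a minimal sketch; the function name `percent_within` is introduced here for illustration.

```python
from math import erf, sqrt


def percent_within(t):
    """Percentage of a normal universe lying within the average +/- t sigma."""
    return 100.0 * erf(t / sqrt(2.0))


for t in (1, 2, 3):
    print(t, round(percent_within(t), 2))
```

The observed percentages of Table VIII may be set against these three figures; the comparison is closest at the $3\sigma$ limits and least close at the $1\sigma$ limits.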
We should not leave this phase of the discussion, however, without
pointing out that in a large number of purely physical experiments a
sufficient number of observations has not been taken to make it
possible to choose the best theoretical distribution. In general more
than 100 observations are required. Thus, in Prof. Millikan's 46
determination of the electron charge $e$ only 58 observations were
made. The values of $\sigma$, $k$, and $\beta_2$ for this distribution
are .128 units, -.196 and 2.358. Even though the observed distribution
is consistent with a normal system of causes, values of $k$ and
$\beta_2$ may be expected to occur which differ from 0 and 3
respectively, as much as these observed values do. In this case even if
$k$ is real and not a result of random sampling, the correction to be
added to the average in order to obtain the most probable value is
insignificantly small.

45 Of course, such an average has no significance, except for a
continuous distribution.
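The statement about the 58 observations can be retraced with the standard deviations of $k$ and $\beta_2$ given earlier and the distance $-k\sigma/2$ between the most probable value and the average. The sketch below uses only the figures quoted in the text; the variable names are introduced here, and the comparison of the correction with the standard deviation of the average is an illustrative yardstick rather than one stated by the author.

```python
from math import sqrt

# Values quoted in the text for the 58 observations.
n_obs, sigma, k, beta2 = 58, 0.128, -0.196, 2.358

sigma_k = sqrt(6 / n_obs)        # standard deviation of k for a normal universe
sigma_beta2 = sqrt(24 / n_obs)   # standard deviation of beta_2

# Departures from 0 and 3, in units of their standard deviations.
k_ratio = abs(k) / sigma_k
beta2_ratio = abs(beta2 - 3) / sigma_beta2

# Correction to the average that would give the most probable value,
# set beside the standard deviation of the average itself.
mode_correction = -k * sigma / 2
sigma_mean = sigma / sqrt(n_obs)

print(round(k_ratio, 2), round(beta2_ratio, 2))
print(round(mode_correction, 4), round(sigma_mean, 4))
```

Both departures are of the order of one standard deviation or less, and the correction is smaller than the standard deviation of the average.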
Next let us consider the problem of determining the number of
observations between any two limits. The physicist is ordinarily
concerned with the probable error: that is, the error such that 1/2 of
the observations lie within the range $\bar{X} \pm$ probable error. Its
magnitude for the normal distribution is $.6745\sigma$, and the errors
are distributed symmetrically on either side of the average. It is
interesting to note that the magnitude of the probable error is also
$.6745\sigma$ for the second approximation, but that the errors are not
distributed symmetrically on either side of the average.
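For the normal law the factor .6745 is the deviate that leaves one quarter of the observations beyond it on each side. A minimal check, with variable names introduced here for illustration:

```python
from statistics import NormalDist

unit_normal = NormalDist()

# Deviate t such that half of the observations lie within +/- t sigma.
probable_error_factor = unit_normal.inv_cdf(0.75)

# Fraction of the observations within +/- that deviate, as a check.
fraction_within = (unit_normal.cdf(probable_error_factor)
                   - unit_normal.cdf(-probable_error_factor))

print(round(probable_error_factor, 4), round(fraction_within, 4))
```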
Another important pair of limits is that including the majority
of the observations. For the normal law 99.73% of the observations
are included within the range $\bar{X} \pm 3\sigma$ which, therefore,
is often called the range. Not a single example has been found,
however, of a distribution for which the observed number of
observations within this range is less than 95% even though the
distribution is decidedly skew. In fact it is seldom less than 98%.
If, however, we have a case such as that represented in Table II where
groups of observations have been taken in what is technically known as
different universes, and then averaged together, the average result is
not the most probable, and the standard deviation of the average is not
inversely proportional to the square root of the number of
observations. Since this point is of considerable importance, it is
perhaps well to state it in a slightly different way. Thus, let us
assume that we have a thousand samples of granular carbon which possess
inherent microphonic efficiencies differing by comparatively large
magnitudes. Transmitters assembled from any one of the groups of carbon
cover a range of efficiencies. If we choose a sample of 10,000
instruments, 5,000 from each of two lots of carbon which do not possess
the same inherent efficiency, we cannot expect, for reasons already
pointed out, that the observed distribution will be normal. The average
of these observations will not in general be the most probable value,
and the standard deviation of the average will not be equal to the
observed standard deviation divided by the square root of the number
of observations, in this case 10,000.

46 Millikan, R. A., The Electron, University of Chicago Press.
We have already seen, however, that it is possible to detect such
errors of sampling, since in general the distribution cannot be fitted
by the second approximation or Gram-Charlier series. If the theo-
retical distribution is either normal, second approximation, or the
law of small numbers, the number of observations to be expected
between any two limits can be readily determined from the tables.
Experience has shown that in every instance where it has been possible
to represent the observed distribution in any of these three ways, the
data obtained in future samplings have always been consistent with
the results to be expected from the theory underlying these three laws.
It will be of interest to note the data given in columns 3, 4, and 5 of
Table VIII and to compare the theoretical percentages (last row) for
the different limits with those observed.
In closing it is of interest to point out further the significance of
some of the results discussed in this paper in connection with the
inspection of equipment. Here we must decide upon a magnitude
of the sample to be measured in order to determine the true percentage
of defective instruments in the product. If p is the percentage
defective, and q that not defective, then the standard deviation about
the average number found in a sample of n chosen from N instruments
is
$$\sigma^2 = pqn\left(1 - \frac{n-1}{N-1}\right).$$
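As a rough numerical check, the following sketch evaluates this standard deviation. It assumes the usual finite-population form, variance = pqn(1 - (n-1)/(N-1)), with p and q expressed as fractions rather than percentages. The function name and the numerical values are hypothetical.

```python
import math

def defectives_sd(p, n, N):
    """Standard deviation of the number of defectives in a sample of n
    drawn without replacement from N instruments, a fraction p of which
    are defective (assumed finite-population form)."""
    if not (0.0 <= p <= 1.0):
        raise ValueError("p must be a fraction between 0 and 1")
    if not (1 <= n <= N) or N < 2:
        raise ValueError("need 1 <= n <= N and N >= 2")
    q = 1.0 - p
    variance = p * q * n * (1.0 - (n - 1) / (N - 1))
    return math.sqrt(variance)

if __name__ == "__main__":
    # Hypothetical case: 10 per cent defective, samples of 100 from 10,000.
    print(defectives_sd(0.10, 100, 10_000))
    # Measuring every instrument leaves no sampling uncertainty.
    print(defectives_sd(0.10, 10_000, 10_000))
```

When N is very large compared with n the bracketed factor approaches unity and the expression reduces to the familiar binomial value, the square root of pqn.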
In practice, however, we never know the true value of p unless we
measure all of the apparatus, and this is impractical. In our calcula-
tions we must therefore use some corrected value. We find, though,
that the average value of p is in most instances the one that must be
used. Assuming that we choose a value of p, the distribution of
defectives in N' samples of n in number will be represented by the
distribution of N'(p+q)^n. If one of the samples is found to contain
a percentage of defectives, which is inconsistent, that is, which is
highly improbable as determined from the distribution of N'(p+q)^n,
it indicates that the product is changing.
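The test described above can be sketched as follows. The terms of the binomial expansion N'(p+q)^n give the expected number of samples containing each possible number of defectives, and the sum of the upper terms gives the probability of finding at least a stated number in one sample. The chosen p, the sample size, and the function names are hypothetical. What counts as "highly improbable" is left to the reader, since no limit is stated here.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k defectives in a sample of n."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)

def expected_frequencies(n_samples, n, p):
    """Terms of n_samples * (p + q)^n: the expected number of samples
    containing 0, 1, ..., n defectives."""
    return [n_samples * binomial_pmf(k, n, p) for k in range(n + 1)]

def upper_tail(k, n, p):
    """Probability of k or more defectives in a sample of n."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))

if __name__ == "__main__":
    # Hypothetical values: p chosen as 0.10, samples of 100 instruments.
    p, n = 0.10, 100
    freqs = expected_frequencies(1000, n, p)
    print("expected samples with exactly 10 defectives:", freqs[10])
    print("chance of 20 or more defectives in one sample:",
          upper_tail(20, n, p))
```

A sample whose count falls where the upper-tail probability is very small would be judged inconsistent with the chosen p, and hence taken as a sign that the product is changing.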
If, however, we take into account the effect of the size of the first
sample in respect to the second as indicated by Pearson, 47 we see that
the distribution of N' samples may be different from that given by
the binomial expansion. In accordance with this theory, if in a first
sample of 100, 10% of the sample is found to possess a given attribute,
47 Pearson, K. Loc. cit. Footnote 30.
the distribution of the percentages to be expected in 1,000 such
samples is indicated by the last column of frequencies in Table III.
In order to show graphically how this distribution differs from that
corresponding to the binomial expansion these two sets of frequencies
are reproduced in Fig. 13. The difference between them is a striking
illustration of the significance of the size of the samples used in
connection with the inspection of equipment, providing we accept
Pearson's results.
[Fig. 13. Distribution of successes of 1000 samples. Horizontal axis: number of successes.]
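The comparison can be sketched numerically under a stated assumption. The formula Pearson used is not restated here, so the sketch takes one common reading of it: before the first sample every value of p is regarded as equally likely, which leads to what is now called the beta-binomial distribution for the second sample. That reading, the equal size of the two samples, and all function names are assumptions made for illustration. The frequencies of Table III are not reproduced.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Natural logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def second_sample_pmf(r, m, s, n):
    """Probability of r successes in a second sample of m, given s
    successes in a first sample of n, all values of p being taken as
    equally likely beforehand (assumed reading of Pearson's result)."""
    log_ratio = (log_beta(s + r + 1, n - s + m - r + 1)
                 - log_beta(s + 1, n - s + 1))
    return comb(m, r) * exp(log_ratio)

def binomial_pmf(r, m, p):
    """Probability of r successes in m trials when p is taken as known."""
    return comb(m, r) * p ** r * (1.0 - p) ** (m - r)

def mean_and_variance(pmf):
    """Mean and variance of a distribution over the counts 0, 1, 2, ..."""
    mean = sum(k * w for k, w in enumerate(pmf))
    var = sum((k - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var

if __name__ == "__main__":
    n = m = 100   # first and second samples assumed equal in size
    s = 10        # 10 per cent of the first sample has the attribute
    pearson = [second_sample_pmf(r, m, s, n) for r in range(m + 1)]
    binom = [binomial_pmf(r, m, s / n) for r in range(m + 1)]
    print("assumed Pearson form, mean and variance:", mean_and_variance(pearson))
    print("binomial expansion, mean and variance:  ", mean_and_variance(binom))
```

On this reading the second-sample distribution is noticeably wider than the binomial one, because the uncertainty in p left by a first sample of only 100 is carried into the prediction.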